1. Background of the Invention

The present application is a continuation-in-part of U. S. Patent Application Serial 
Number 08/754,490, filed November 20, 1996, the entire content of which is incorporated
herein by reference. 

1.1 Field of the Invention 

The present invention provides new proteins for combating insects, and
particularly, coleopteran, dipteran, and lepidopteran insects sensitive to the disclosed
δ-endotoxins derived from Bacillus thuringiensis. The invention provides novel chimeric
crystal proteins and the chimeric cry gene segments which encode them, as well as 
methods for making and using these DNA segments, methods of producing the encoded 
proteins, methods for making synthetically-modified chimeric crystal proteins, and 
methods of making and using the synthetic crystal proteins. 

1.2 Description of Related Art

1.2.1 B. thuringiensis Crystal Proteins

The Gram-positive soil bacterium B. thuringiensis is well known for its
production of proteinaceous parasporal crystals, or δ-endotoxins, that are toxic to a
variety of lepidopteran, coleopteran, and dipteran larvae. B. thuringiensis produces
crystal proteins during sporulation which are specifically toxic to certain species of
insects. Many different strains of B. thuringiensis have been shown to produce
insecticidal crystal proteins, and compositions comprising B. thuringiensis strains which
produce proteins having insecticidal activity have been used commercially as
environmentally-acceptable insecticides because of their toxicity to the specific target
insect, and non-toxicity to plants and other non-targeted organisms.

Commercial formulations of naturally occurring B. thuringiensis isolates have
long been used for the biological control of agricultural insect pests. In commercial
production, the spores and crystals obtained from the fermentation process are
concentrated and formulated for foliar application according to conventional agricultural
practices.

1.2.2 Nomenclature of Crystal Proteins 

A review by Höfte et al. (1989) describes the general state of the art with respect
to the majority of insecticidal B. thuringiensis strains that have been identified which are
active against insects of the Order Lepidoptera, i.e., caterpillar insects. This treatise also
describes B. thuringiensis strains having insecticidal activity against insects of the Orders
Diptera (i.e., flies and mosquitoes) and Coleoptera (i.e., beetles). A number of genes
encoding crystal proteins have been cloned from several strains of B. thuringiensis.
Höfte et al. (1989) discusses the genes and proteins that were identified in B.
thuringiensis prior to 1990, and sets forth the nomenclature and classification scheme
which has traditionally been applied to B. thuringiensis genes and proteins. cry1 genes
encode lepidopteran-toxic Cry1 proteins. cry2 genes encode Cry2 proteins that are toxic
to both lepidopterans and dipterans. cry3 genes encode coleopteran-toxic Cry3 proteins,
while cry4 genes encode dipteran-toxic Cry4 proteins, etc.

Recently a new nomenclature has been proposed which systematically classifies
the Cry proteins based upon amino acid sequence homology rather than upon insect target
specificities. This classification scheme is summarized in Table 1.

Table 1

Revised B. thuringiensis δ-Endotoxin Nomenclature (a)

New        Old          GenBank Accession #

Cry1Aa     CryIA(a)     M11250
Cry1Ab     CryIA(b)     M13898
Cry1Ac     CryIA(c)     M11068
Cry1Ad     CryIA(d)     M73250
Cry1Ae     CryIA(e)     M65252
Cry1Ba     CryIB        X06711
Cry1Bb     ET5          L32020
Cry1Bc     PEG5         Z46442
Cry1Bd     CryE1        U70726
Cry1Ca     CryIC        X07518
Cry1Cb     CryIC(b)     M97880
Cry1Da     CryID        X54160
Cry1Db     PrtB         Z22511
Cry1Ea     CryIE        X53985
Cry1Eb     CryIE(b)     M73253
Cry1Fa     CryIF        M63897
Cry1Fb     PrtD         Z22512
Cry1Ga     PrtA         Z22510
Cry1Gb     CryH2        U70725
Cry1Ha     PrtC         Z22513
Cry1Hb                  U35780
Cry1Ia     CryV         X62821
Cry1Ib     CryV         U07642
Cry1Ja     ET4          L32019
Cry1Jb     ET1          U31527
Cry1K                   U28801
Cry2Aa     CryIIA       M31738
Cry2Ab     CryIIB       M23724
Cry2Ac     CryIIC       X57252
Cry3A      CryIIIA      M22472
Cry3Ba     CryIIIB      X17123
Cry3Bb     CryIIIB2     M89794
Cry3C      CryIIID      X59797
Cry4A      CryIVA       Y00423
Cry4B      CryIVB       X07423
Cry5Aa     CryVA(a)     L07025
Cry5Ab     CryVA(b)     L07026
Cry5B                   U19725
Cry6A      CryVIA       L07022
Cry6B      CryVIB       L07024
Cry7Aa     CryIIIC      M64478
Cry7Ab     CryIIICb     U04367
Cry8A      CryIIIE      U04364
Cry8B      CryIIIG      U04365
Cry8C      CryIIIF      U04366
Cry9A      CryIG        X58120
Cry9B      CryIX        X75019
Cry9C      CryIH        Z37527
Cry10A     CryIVC       M12662
Cry11A     CryIVD       M31737
Cry11B     Jeg80        X86902
Cry12A     CryVB        L07027
Cry13A     CryVC        L07023
Cry14A     CryVD        U13955
Cry15A     34kDa        M76442
Cry16A     cbm71        X94146
Cry17A     cbm71        X99478
Cry18A     CryBP1       X99049
Cry19A     Jeg65        Y08920
Cyt1Aa     CytA         X03182
Cyt1Ab     CytM         X98793
Cyt1B                   U37196
Cyt2A      CytB         Z14147
Cyt2B      CytB         U52043

(a) Adapted from: http://epunix.biols.susx.ac.uk/Home/Neil_Crickmore/Bt/inde
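
By way of illustration only, a few rows of Table 1 can be held in a small lookup structure. The following Python sketch is not part of the disclosed subject matter; the names TABLE_1_SUBSET and revised_names are hypothetical, and only seven rows of the table are included. Because some old designations (for example, CryV) correspond to more than one revised name, the lookup returns a list rather than a single value.

```python
# Subset of Table 1: revised name -> (old name, GenBank accession number).
# Only seven rows are reproduced here; the structure is illustrative.
TABLE_1_SUBSET = {
    "Cry1Aa": ("CryIA(a)", "M11250"),
    "Cry1Ab": ("CryIA(b)", "M13898"),
    "Cry1Ac": ("CryIA(c)", "M11068"),
    "Cry1Ca": ("CryIC", "X07518"),
    "Cry1Fa": ("CryIF", "M63897"),
    "Cry1Ia": ("CryV", "X62821"),
    "Cry1Ib": ("CryV", "U07642"),
}


def revised_names(old_name):
    """Return every revised name in the subset whose old designation matches."""
    return sorted(
        new for new, (old, _accession) in TABLE_1_SUBSET.items() if old == old_name
    )


print(revised_names("CryIF"))  # ['Cry1Fa']
print(revised_names("CryV"))   # ['Cry1Ia', 'Cry1Ib']
```

Keying the structure on the revised name keeps each entry unique, which the old designations do not guarantee.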



1.2.3 Mode of Crystal Protein Toxicity

All δ-endotoxin crystals are toxic to insect larvae by ingestion. Solubilization of
the crystal in the midgut of the insect releases the protoxin form of the δ-endotoxin
which, in most instances, is subsequently processed to an active toxin by midgut protease.
The activated toxins recognize and bind to the brush-border of the insect midgut
epithelium through receptor proteins. Several putative crystal protein receptors have been
isolated from certain insect larvae (Knight et al., 1995; Gill et al., 1995; Masson et al.,
1995). The binding of active toxins is followed by intercalation and aggregation of toxin
molecules to form pores within the midgut epithelium. This process leads to osmotic
imbalance, swelling, lysis of the cells lining the midgut epithelium, and eventual larval
mortality.

1.2.4 Molecular Biology of δ-Endotoxins

With the advent of molecular genetic techniques, various δ-endotoxin genes have
been isolated and their DNA sequences determined. These genes have been used to
construct certain genetically engineered B. thuringiensis products that have been
approved for commercial use. Recent developments have seen new δ-endotoxin delivery
systems developed, including plants that contain and express genetically engineered
δ-endotoxin genes.

The cloning and sequencing of a number of δ-endotoxin genes from a variety of
Bacillus thuringiensis strains have been described and are summarized by Höfte and
Whiteley (1989). Plasmid shuttle vectors designed for the cloning and expression of
δ-endotoxin genes in E. coli or B. thuringiensis are described by Gawron-Burke and
Baum (1991). U. S. Patent No. 5,441,884 discloses a site-specific recombination system
for constructing recombinant B. thuringiensis strains containing δ-endotoxin genes that
are free of DNA not native to B. thuringiensis.

The Cry1 family of crystal proteins, which are primarily active against
lepidopteran pests, is the best studied class of δ-endotoxins. The protoxin form of Cry1
δ-endotoxins consists of two approximately equal sized segments. The carboxyl-half, or
protoxin segment, is not toxic and is thought to be important for crystal formation
(Arvidson et al., 1989). The amino-half of the protoxin comprises the active-toxin
segment of the Cry1 molecule and may be further divided into three structural domains as
determined by the recently described crystallographic structure for the active toxin
segment of the Cry1Aa δ-endotoxin (Grochulski et al., 1995). Domain 1 occupies the
first third of the active toxin and is essential for channel formation (Thompson et al.,
1995). Domain 2 and domain 3 occupy the middle and last third of the active toxin,
respectively. Both domains 2 and 3 have been implicated in receptor binding and insect
specificity, depending on the insect and δ-endotoxin being examined (Thompson et al.,
1995).

1.2.5 Chimeric Crystal Proteins

In recent years, researchers have focused effort on the construction of hybrid
δ-endotoxins with the hope of producing proteins with enhanced activity or improved
properties. Advances in the art of molecular genetics over the past decade have
facilitated a logical and orderly approach to engineering proteins with improved
properties. Site-specific and random mutagenesis methods, the advent of polymerase
chain reaction (PCR™) methodologies, and the development of recombinant methods for
generating gene fusions and constructing chimeric proteins have facilitated an assortment
of methods for changing amino acid sequences of proteins, fusing portions of two or
more proteins together in a single recombinant protein, and altering genetic sequences
that encode proteins of commercial interest.

Unfortunately, for crystal proteins, these techniques have only been exploited in
limited fashion. The likelihood of arbitrarily creating a chimeric protein with enhanced
properties from portions of the numerous native proteins which have been identified is
remote given the complex nature of protein structure, folding, oligomerization, activation,
and correct processing of the chimeric protoxin to an active moiety. Only by careful
selection of specific target regions within each protein, and subsequent protein
engineering, can toxins be synthesized which have improved insecticidal activity.

Some success in the area, however, has been reported in the literature. For
example, the construction of a few hybrid δ-endotoxins is reported in the following
related art: Intl. Pat. Appl. Publ. No. WO 95/30753 discloses the construction of hybrid
B. thuringiensis δ-endotoxins for production in Pseudomonas fluorescens in which the
non-toxic protoxin fragment of Cry1F has been replaced by the non-toxic protoxin
fragment from the Cry1Ac/Cry1Ab that is disclosed in U. S. Patent 5,128,130.

U. S. Patent 5,128,130 discloses the construction of hybrid B. thuringiensis
δ-endotoxins for production in P. fluorescens in which a portion of the non-toxic protoxin
segment of Cry1Ac is replaced with the corresponding non-toxic protoxin fragment of
Cry1Ab. U. S. Patent 5,055,294 discloses the construction of a specific hybrid
δ-endotoxin between Cry1Ac (amino acid residues 1-466) and Cry1Ab (amino acid
residues 466-1155) for production in P. fluorescens. Although the aforementioned patent
discloses the construction of a hybrid toxin within the active toxin segment, no specifics
are presented in regard to the hybrid toxin's insecticidal activity. Intl. Pat. Appl. Publ.
No. WO 95/30752 discloses the construction of hybrid B. thuringiensis δ-endotoxins for
production in P. fluorescens in which the non-toxic protoxin segment of Cry1C is
replaced by the non-toxic protoxin segment from Cry1Ab. The aforementioned
application further discloses that the activity against Spodoptera exigua for the hybrid
δ-endotoxin is improved over that of the parent active toxin, Cry1C.
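
The hybrids discussed above are each defined by residue ranges taken from two parent proteins. By way of illustration only, the bookkeeping for such a fusion can be sketched in Python as follows. The function names residues and splice_hybrid are hypothetical, residue numbering is assumed to be 1-based and inclusive, and the two parent strings are toy placeholders rather than actual crystal protein sequences.

```python
def residues(seq, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} lies outside a sequence of length {len(seq)}")
    return seq[start - 1:end]


def splice_hybrid(n_parent, n_range, c_parent, c_range):
    """Join an amino-terminal range of one parent to a carboxyl-terminal range of another."""
    return residues(n_parent, *n_range) + residues(c_parent, *c_range)


# Toy placeholder sequences (assumption): NOT real Cry protein sequences.
parent_a = "A" * 20
parent_b = "B" * 30

# Residues 1-8 of parent_a fused to residues 9-30 of parent_b.
hybrid = splice_hybrid(parent_a, (1, 8), parent_b, (9, 30))
print(len(hybrid))  # 30
print(hybrid)
```

The sketch records only where the exchange point falls; it says nothing about whether a given fusion folds, is processed correctly, or is active.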

Intl. Pat. Appl. Publ. No. WO 95/06730 discloses the construction of a hybrid B.
thuringiensis δ-endotoxin consisting of domains 1 and 2 of Cry1E coupled to domain 3
and the non-toxic protoxin segment of Cry1C. Insect bioassays performed against
Manduca sexta (sensitive to Cry1C and Cry1E), Spodoptera exigua (sensitive to Cry1C),
and Mamestra brassicae (sensitive to Cry1C) show that the Cry1E/Cry1C hybrid
toxin is active against M. sexta, S. exigua, and M. brassicae. The bioassay results were
expressed as EC50 values (toxin concentration giving a 50% growth reduction) rather than
LC50 values (toxin concentration giving 50% mortality). Although the δ-endotoxins used
for bioassay were produced in B. thuringiensis, only artificially-generated active
segments of the δ-endotoxins were used, not the naturally-produced crystals typically
produced by B. thuringiensis that are present in commercial B. thuringiensis
formulations. Bioassay results indicated that the LC50 values for the hybrid
Cry1E/Cry1C crystal against S. frugiperda were 1.5 to 1.7 fold lower (more active) than
for native Cry1C. This art also discloses the construction of a hybrid B. thuringiensis
δ-endotoxin between Cry1Ab (domains 1 and 2) and Cry1C (domain 3 and the non-toxic
protoxin segment), although no data are given regarding the hybrid toxin's activity or
usefulness.

Lee et al. (1995) report the construction of hybrid B. thuringiensis δ-endotoxins
between Cry1Ac and Cry1Aa within the active toxin segment. Artificially generated
active segments of the hybrid toxins were used to examine protein interactions in
susceptible insect brush border membrane vesicles (BBMV). The bioactivity of the
hybrid toxins was not reported.

Honée et al. (1991) report the construction of hybrid δ-endotoxins between Cry1C
(domain 1) and Cry1Ab (domains 2 and 3) and the reciprocal hybrid between Cry1Ab
(domain 1) and Cry1C (domains 2 and 3). These hybrids failed to show any significant
increase in activity against susceptible insects. Furthermore, the Cry1C (domain
1)/Cry1Ab (domains 2 and 3) hybrid toxin was found to be hypersensitive to protease
degradation. A report by Schnepf et al. (1990) discloses the construction of a Cry1Ac
hybrid toxin in which a small portion of domain 2 was replaced by the corresponding
region of Cry1Aa, although no significant increase in activity against susceptible insect
larvae was observed.

1.3 Deficiencies in the Prior Art

The limited successes in producing chimeric crystal proteins which have
improved activity have negatively impacted the field by thwarting efforts to produce
recombinantly-engineered crystal protein for commercial development, and to extend the
toxic properties and host specificities of the known endotoxins. Therefore, what is lacking
in the prior art are reliable methods and compositions comprising recombinantly-engineered
crystal proteins which have improved insecticidal activity, broad-host-range
specificities, and which are suitable for commercial production in B. thuringiensis.

2. Summary of the Invention

The present invention overcomes these and other limitations in the prior art by
providing novel chimeric δ-endotoxins which have improved insecticidal properties, and
broad-range specificities.

Disclosed are methods for the construction of B. thuringiensis hybrid
δ-endotoxins comprising amino acid sequences from native Cry1Ac and Cry1F crystal
proteins. These hybrid proteins, in which all or a portion of Cry1Ac domain 2, all or a
portion of Cry1Ac domain 3, and all or a portion of the Cry1Ac protoxin segment are
replaced by the corresponding portions of Cry1F, possess not only the insecticidal
characteristics of the parent δ-endotoxins, but also have the unexpected and remarkable
properties of enhanced broad-range specificity which is not proficiently displayed by
either of the native δ-endotoxins from which the chimeric proteins were engineered.

Specifically, the present invention discloses and claims genetically-engineered
hybrid δ-endotoxins which comprise a portion of a Cry1Ac crystal protein fused to a
portion of a Cry1F crystal protein. These chimeric endotoxins have broad-range
specificity for the insect pests described herein.

In a further embodiment, the present invention also discloses and claims
recombinant B. thuringiensis hybrid δ-endotoxins which comprise a portion of Cry1Ab,
Cry1F, and Cry1Ac in which all or a portion of Cry1Ab domain 2 or all or a portion of
Cry1Ab domain 3 is replaced by the corresponding portions of Cry1F and all or a portion
of the Cry1Ab protoxin segment is replaced by the corresponding portions of Cry1Ac.
Exemplary hybrid δ-endotoxins between Cry1Ab and Cry1F are identified in
SEQ ID NO:13 and SEQ ID NO:14.

One aspect of the present invention demonstrates the unexpected result that
certain hybrid δ-endotoxins derived from Cry1Ac and Cry1F proteins exhibit not only the
insecticidal characteristics of the parent δ-endotoxins, but also possess insecticidal
activity which is not proficiently displayed by either of the parent δ-endotoxins.

Another aspect of the invention further demonstrates the unexpected result that
certain chimeric Cry1Ab/Cry1F proteins maintain not only the insecticidal characteristics
of the parent δ-endotoxins, but also exhibit insecticidal activity which is not displayed by
either the native Cry1Ab or Cry1F endotoxins.

The present invention also encompasses Cry1Ac/Cry1F and Cry1Ab/Cry1F
hybrid δ-endotoxins that maintain the desirable characteristics needed for commercial
production in B. thuringiensis. Specifically, the hybrid δ-endotoxins identified in
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28, SEQ
ID NO:30, and SEQ ID NO:34 can efficiently form proteinaceous parasporal inclusions
in B. thuringiensis and have the favorable characteristics of solubility, protease
susceptibility, and insecticidal activity of the parent δ-endotoxins.

In a further embodiment, the present invention also discloses and claims
recombinant B. thuringiensis hybrid δ-endotoxins which comprise a portion of Cry1Ac
and Cry1C in which all or a portion of Cry1Ac domain 3 is replaced by the corresponding
portions of Cry1C and all or a portion of the Cry1Ac protoxin segment is replaced by the
corresponding portion of Cry1C. Exemplary hybrid δ-endotoxins between Cry1Ac and
Cry1C are identified in SEQ ID NO:29 and SEQ ID NO:30.

One aspect of the present invention demonstrates the unexpected result that,
although neither Cry1Ac nor Cry1C possesses S. frugiperda activity, the Cry1Ac/Cry1C
hybrid δ-endotoxin identified by SEQ ID NO:29 and SEQ ID NO:30 has significant
activity against S. frugiperda. Furthermore, the Cry1Ac/Cry1C hybrid δ-endotoxin
identified by SEQ ID NO:29 and SEQ ID NO:30 has significantly better activity against
S. exigua than the Cry1C parental δ-endotoxin.

The present invention further pertains to the recombinant nucleic acid sequences
which encode the novel crystal proteins disclosed herein. Specifically, the invention
discloses and claims the nucleic acid sequences of SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, and SEQ ID NO:33;
nucleic acid sequences which are complementary to the nucleic acid sequences of
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID
NO:29, and SEQ ID NO:33; and nucleic acid sequences which hybridize to the sequences
of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, and SEQ ID NO:33.

The novel hybrid δ-endotoxins disclosed herein are useful in the control of a
broad range of insect pests. These hybrid δ-endotoxins are described in FIG. 1 and FIG.
4 and are disclosed in SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, and SEQ ID NO:34. The nucleic acid segments
encoding these proteins are disclosed in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, and SEQ ID NO:33. The insecticidal
and biochemical properties of the hybrid δ-endotoxins are described in FIG. 2, FIG. 3,
and Table 4, Table 5, Table 6, and Table 7. The broad host range of the improved
δ-endotoxins specified in the present invention is useful in circumventing dilution effects
caused by expressing multiple δ-endotoxin genes within a single B. thuringiensis strain.
Expression of such a broad host range δ-endotoxin in plants is expected to impart
protection against a wider variety of insect pests.

The impetus for constructing these and other hybrid δ-endotoxins is to create
novel toxins with improved insecticidal activity, increased host-range specificity, and
improved production characteristics. The DNA sequences listed in Table 7 define the
exchange points for the hybrid δ-endotoxins pertinent to the present invention and, as
oligonucleotide primers, may be used to identify like or similar hybrid δ-endotoxins by
Southern or colony hybridization under conditions of moderate to high stringency.
Researchers skilled in the art will recognize the importance of the exchange site chosen
between two or more δ-endotoxins, and that such exchanges can be achieved using a
number of in vivo or in vitro molecular genetic techniques. Small variations in the
exchange region between two or more δ-endotoxins may yield similar results or, as
demonstrated for EG11062 and EG11063, adversely affect desirable traits. Similarly,
large variations in the exchange region between two or more δ-endotoxins may have no
effect on desired traits, as demonstrated by EG11063 and EG11074, or may adversely
affect desirable traits, as demonstrated by EG11060 and EG11063.

Favorable traits with regard to improved insecticidal activity, increased host
range, and improved production characteristics may be achieved by other such hybrid
δ-endotoxins including, but not limited to, the cry1, cry2, cry3, cry4, cry5, cry6, cry7,
cry8, cry9, cry10, cry11, cry12, cry13, cry14, cry15 class of δ-endotoxin genes and the B.
thuringiensis cytolytic cyt1 and cyt2 genes. Members of these classes of B. thuringiensis
insecticidal proteins include, but are not limited to Cry1Aa, Cry1Ab, Cry1Ac, Cry1Ad,
Cry1Ae, Cry1Ba, Cry1Bb, Cry1Ca, Cry1Cb, Cry1Da, Cry1Db, Cry1Ea, Cry1Eb,
Cry1Fa, Cry1Fb, Cry1Ga, Cry1Ha, Cry2a, Cry2b, Cry1Ja, Cry1Ka, Cry11Aa, Cry11Ab,
Cry12Aa, Cry3Ba, Cry3Bb, Cry3C, Cry4a, Cry4Ba, Cry5a, Cry5Ab, Cry6Aa, Cry6Ba,
Cry7Aa, Cry7Ab, Cry8Aa, Cry8Ba, Cry8Ca, Cry9Aa, Cry9Ba, Cry9Ca, Cry10Aa,
Cry11Aa, Cry12Aa, Cry13Aa, Cry14Aa, Cry15Aa, Cyt1Aa, and Cyt2Aa. Related hybrid
δ-endotoxins would consist of the amino portion of one of the aforementioned
δ-endotoxins, including all or part of domain 1 or domain 2, fused to all or part of domain
3 from another of the aforementioned δ-endotoxins. The non-active protoxin fragment of
such hybrid δ-endotoxins may consist of the protoxin fragment from any of the
aforementioned δ-endotoxins which may act to stabilize the hybrid δ-endotoxin as
demonstrated by EG11087 and EG11091 (see e.g., Table 4). Hybrid δ-endotoxins
possessing traits similar to those described in the present invention could be constructed
by conservative, or "similar," replacements of amino acids within hybrid δ-endotoxins.
Such substitutions would mimic the biochemical and biophysical properties of the native
amino acid at any position in the protein. Amino acids considered similar include, for
example, but are not limited to:

Ala, Ser, and Thr;

Asp and Glu;

Asn and Gln;

Lys and Arg;

Ile, Leu, Met, and Val; and

Phe, Tyr, and Trp.
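
By way of illustration only, the groupings above can be used to test whether a proposed replacement counts as "similar." The following Python sketch is not part of the disclosed methods; the names SIMILAR_GROUPS and is_similar_replacement are hypothetical, three-letter residue codes are assumed, and, because the listed groups are expressly non-limiting, a False result means only that the pair does not appear in the groups listed here.

```python
# Groups of amino acids treated as "similar", taken from the list above.
SIMILAR_GROUPS = [
    {"Ala", "Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Ile", "Leu", "Met", "Val"},
    {"Phe", "Tyr", "Trp"},
]


def is_similar_replacement(original, replacement):
    """Return True if both residues are identical or share one of the listed groups."""
    if original == replacement:
        return True
    return any(original in group and replacement in group for group in SIMILAR_GROUPS)


print(is_similar_replacement("Asp", "Glu"))  # True
print(is_similar_replacement("Asp", "Lys"))  # False
```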

Researchers skilled in the art will recognize that improved insecticidal activity,
increased host range, and improved production characteristics imparted upon hybrid
δ-endotoxins may be further improved by altering the genetic code for one or more amino
acid positions in the hybrid δ-endotoxin such that the position, or positions, is replaced by
any other amino acid. This may be accomplished by targeting a region or regions of the
protein for mutagenesis by any number of established mutagenic techniques, including
those procedures relevant to the present invention. Such techniques include site-specific
mutagenesis (Kunkle, 1985; Kunkle et al., 1987), DNA shuffling (Stemmer, 1994), and
PCR™ overlap extension (Horton et al., 1989). Since amino acids situated at or near the
surface of a protein are likely responsible for its interaction with other proteinaceous or
non-proteinaceous moieties, they may serve as "target" regions for mutagenesis. Such
surface exposed regions may consist of, but not be limited to, surface exposed amino acid
residues within the active toxin fragment of the protein and include the inter-α-helical or
inter-β-strand "loop" regions of δ-endotoxins that separate α-helices within domain 1
and β-strands within domain 2 and domain 3. Such procedures may favorably change the
protein's biochemical and biophysical characteristics or its mode of action as outlined in
Section 1. These include, but are not limited to: 1) improved crystal formation, 2)
improved protein stability or reduced protease degradation, 3) improved insect membrane
receptor recognition and binding, 4) improved oligomerization or channel formation in
the insect midgut epithelium, and 5) improved insecticidal activity or insecticidal
specificity due to any or all of the reasons stated above.

2.1 Crystal Protein Transgenes and Transgenic Plants 

In yet another aspect, the present invention provides methods for producing a
transgenic plant which expresses a nucleic acid segment encoding the novel chimeric
crystal proteins of the present invention. The process of producing transgenic plants is
well-known in the art. In general, the method comprises transforming a suitable host cell
with a DNA segment which contains a promoter operatively linked to a coding region
that encodes a B. thuringiensis Cry1Ac-1F, Cry1Ab-1F, Cry1Ac-1C, or Cry1Ab-1Ac-1F
chimeric crystal protein. Such a coding region is generally operatively linked to a
transcription-terminating region, whereby the promoter is capable of driving the
transcription of the coding region in the cell, and hence providing the cell the ability to
produce the recombinant protein in vivo. Alternatively, in instances where it is desirable
to control, regulate, or decrease the amount of a particular recombinant crystal protein
expressed in a particular transgenic cell, the invention also provides for the expression of
crystal protein antisense mRNA. The use of antisense mRNA as a means of controlling
or decreasing the amount of a given protein of interest in a cell is well-known in the art.

Another aspect of the invention comprises a transgenic plant which expresses a gene
or gene segment encoding one or more of the novel polypeptide compositions disclosed
herein. As used herein, the term "transgenic plant" is intended to refer to a plant that has
incorporated DNA sequences, including but not limited to genes which are perhaps not
normally present, DNA sequences not normally transcribed into RNA or translated into a
protein ("expressed"), or any other genes or DNA sequences which one desires to
introduce into the non-transformed plant, such as genes which may normally be present in
the non-transformed plant but which one desires to either genetically engineer or to have
altered expression. The construction and expression of synthetic B. thuringiensis genes
in plants has been described in detail in U. S. Patents 5,500,365 and 5,380,831 (each
specifically incorporated herein by reference).

It is contemplated that in some instances the genome of a transgenic plant of the
present invention will have been augmented through the stable introduction of one or
more cry1Ac-1F, cry1Ab-1F, cry1Ac-1C, or cry1Ab-1Ac-1F transgenes, either native,
synthetically-modified, or further mutated. In some instances, more than one transgene
will be incorporated into the genome of the transformed host plant cell. Such is the case
when more than one crystal protein-encoding DNA segment is incorporated into the
genome of such a plant. In certain situations, it may be desirable to have one, two, three,
four, or even more B. thuringiensis crystal proteins (either native or recombinantly-engineered)
incorporated and stably expressed in the transformed transgenic plant.

A preferred gene which may be introduced, such as those disclosed in SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, and
SEQ ID NO:33, includes, for example, a crystal protein-encoding DNA
sequence of bacterial origin, and particularly one or more of those described herein
which are obtained from Bacillus spp. Highly preferred nucleic acid sequences are those
obtained from B. thuringiensis, or any of those sequences which have been genetically
engineered to decrease or increase the insecticidal activity of the crystal protein in such a
transformed host cell.

Means for transforming a plant cell and the preparation of a transgenic cell line
are well-known in the art, and are discussed herein. Vectors, plasmids, cosmids, yeast
artificial chromosomes (YACs) and nucleic acid segments for use in transforming such
cells will, of course, generally comprise either the operons, genes, or gene-derived
sequences of the present invention, either native, or synthetically-derived, and
particularly those encoding the disclosed crystal proteins. These DNA constructs can
further include structures such as promoters, enhancers, polylinkers, or even gene
sequences which have positively- or negatively-regulating activity upon the particular
genes of interest as desired. The DNA segment or gene may encode either a native or
modified crystal protein, which will be expressed in the resultant recombinant cells,
and/or which will impart an improved phenotype to the regenerated plant. Nucleic acid
sequences optimized for expression in plants have been disclosed in Intl. Pat. Appl. Publ.
No. WO 93/07278 (specifically incorporated herein by reference).

Such transgenic plants may be desirable for increasing the insecticidal resistance
of a monocotyledonous or dicotyledonous plant, by incorporating into such a plant a
transgenic DNA segment encoding Cry1Ac-1F and/or Cry1Ac-1C, and/or Cry1Ab-1F
and/or Cry1Ab-1Ac-1F crystal protein(s) which possess broad insect specificity.
Particularly preferred plants include grains, including but not limited to corn, wheat, oats,
rice, maize, and barley; cotton; soybeans and other legumes; trees, including but not
limited to ornamentals, shrubs, fruits, and nuts; vegetables; turf and pasture grasses; berries;
citrus; and other crops of commercial interest, such as garden crops and/or houseplants,
succulents, cacti, and flowering species.

In a related aspect, the present invention also encompasses a seed produced by the
transformed plant, a progeny from such seed, and a seed produced by the progeny of the
original transgenic plant, produced in accordance with the above process. Such progeny
and seeds will have a crystal protein transgene stably incorporated into their genomes,
and such progeny plants will inherit the traits afforded by the introduction of a stable
transgene in Mendelian fashion. All such transgenic plants having incorporated into their
genome transgenic DNA segments encoding one or more chimeric crystal proteins or
polypeptides are aspects of this invention.

2.2 Crystal Protein Screening and Immunodetection Kits

The present invention contemplates methods and kits for screening samples
suspected of containing crystal protein polypeptides or crystal protein-related
polypeptides, or cells producing such polypeptides. Exemplary proteins include those
disclosed in SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, and SEQ ID NO:34. Said kit can contain a nucleic acid
segment or an antibody of the present invention. The kit can contain reagents for
detecting an interaction between a sample and a nucleic acid or antibody of the present
invention. The provided reagent can be radio-, fluorescently- or enzymatically-labeled.
The kit can contain a known radiolabeled agent capable of binding or interacting with a
nucleic acid or antibody of the present invention.

The reagent of the kit can be provided as a liquid solution, attached to a solid 
support or as a dried powder. Preferably, when the reagent is provided in a liquid 
solution, the liquid solution is an aqueous solution. Preferably, when the reagent
provided is attached to a solid support, the solid support can be chromatography media, a
test plate having a plurality of wells, or a microscope slide. When the reagent provided is
a dry powder, the powder can be reconstituted by the addition of a suitable solvent, which
may be provided.

In still further embodiments, the present invention concerns immunodetection 
methods and associated kits. It is proposed that the crystal proteins or peptides of the 
present invention may be employed to detect antibodies having reactivity therewith, or, 
alternatively, antibodies prepared in accordance with the present invention, may be 
employed to detect crystal proteins or crystal protein-related epitope-containing peptides. 
In general, these methods will include first obtaining a sample suspected of containing 
such a protein, peptide or antibody, contacting the sample with an antibody or peptide in 
accordance with the present invention, as the case may be, under conditions effective to 
allow the formation of an immunocomplex, and then detecting the presence of the 
immunocomplex. 

In general, the detection of immunocomplex formation is quite well known in the 
art and may be achieved through the application of numerous approaches. For example, 
the present invention contemplates the application of ELISA, RIA, immunoblot (e.g., dot
blot), indirect immunofluorescence techniques and the like. Generally, immunocomplex
formation will be detected through the use of a label, such as a radiolabel or an enzyme 
tag (such as alkaline phosphatase, horseradish peroxidase, or the like). Of course, one 
may find additional advantages through the use of a secondary binding ligand such as a 
second antibody or a biotin/avidin ligand binding arrangement, as is known in the art.

For assaying purposes, it is proposed that virtually any sample suspected of
comprising either a crystal protein or peptide or a crystal protein-related peptide or
antibody sought to be detected, as the case may be, may be employed. It is contemplated
that such embodiments may have application in the titering of antigen or antibody
samples, in the selection of hybridomas, and the like. In related embodiments, the
present invention contemplates the preparation of kits that may be employed to detect the
presence of crystal proteins or related peptides and/or antibodies in a sample. Samples
may include cells, cell supernatants, cell suspensions, cell extracts, enzyme fractions,
protein extracts, or other cell-free compositions suspected of containing crystal proteins
or peptides. Generally speaking, kits in accordance with the present invention will
include a suitable crystal protein, peptide or an antibody directed against such a protein or
peptide, together with an immunodetection reagent and a means for containing the
antibody or antigen and reagent. The immunodetection reagent will typically comprise a
label associated with the antibody or antigen, or associated with a secondary binding
ligand. Exemplary ligands might include a secondary antibody directed against the first
antibody or antigen or a biotin or avidin (or streptavidin) ligand having an associated
label. Of course, as noted above, a number of exemplary labels are known in the art and
all such labels may be employed in connection with the present invention.

The container will generally include a vial into which the antibody, antigen or
detection reagent may be placed, and preferably suitably aliquoted. The kits of the
present invention will also typically include a means for containing the antibody, antigen,
and reagent containers in close confinement for commercial sale. Such containers may
include injection or blow-molded plastic containers into which the desired vials are
retained.

2.3 ELISAs and Immunoprecipitation

ELISAs may be used in conjunction with the invention. In an ELISA assay, 
proteins or peptides incorporating crystal protein antigen sequences are immobilized onto 
a selected surface, preferably a surface exhibiting a protein affinity such as the wells of a
polystyrene microtiter plate. After washing to remove incompletely adsorbed material, it
is desirable to bind or coat the assay plate wells with a nonspecific protein that is known
to be antigenically neutral with regard to the test antisera, such as bovine serum albumin
(BSA), casein or solutions of milk powder. This allows for blocking of nonspecific
adsorption sites on the immobilizing surface and thus reduces the background caused by
nonspecific binding of antisera onto the surface.

After binding of antigenic material to the well, coating with a non-reactive 
material to reduce background, and washing to remove unbound material, the 
immobilizing surface is contacted with the antisera or clinical or biological extract to be
tested in a manner conducive to immune complex (antigen/antibody) formation. Such
conditions preferably include diluting the antisera with diluents such as BSA, bovine
gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween®. These added
agents also tend to assist in the reduction of nonspecific background. The layered
antisera is then allowed to incubate for from about 2 to about 4 hours, at temperatures
preferably on the order of about 25° to about 27°C. Following incubation, the
antisera-contacted surface is washed so as to remove non-immunocomplexed material. A
preferred washing procedure includes washing with a solution such as PBS/Tween®, or
borate buffer.

Following formation of specific immunocomplexes between the test sample and 
the bound antigen, and subsequent washing, the occurrence and even amount of
immunocomplex formation may be determined by subjecting same to a second antibody
having specificity for the first. To provide a detecting means, the second antibody will
preferably have an associated enzyme that will generate a color development upon
incubating with an appropriate chromogenic substrate. Thus, for example, one will desire
to contact and incubate the antisera-bound surface with a urease or peroxidase-conjugated
anti-human IgG for a period of time and under conditions which favor the development of
immunocomplex formation (e.g., incubation for 2 hours at room temperature in a
PBS-containing solution such as PBS-Tween®).

After incubation with the second enzyme-tagged antibody, and subsequent to
washing to remove unbound material, the amount of label is quantified by incubation
with a chromogenic substrate such as urea and bromocresol purple or
2,2'-azino-di-(3-ethyl-benzthiazoline)-6-sulfonic acid (ABTS) and H2O2, in the case of
peroxidase as the enzyme label. Quantitation is then achieved by measuring the degree of
color generation, e.g., using a visible spectrum spectrophotometer.

The anti-crystal protein antibodies of the present invention are particularly useful
for the isolation of other crystal protein antigens by immunoprecipitation.
Immunoprecipitation involves the separation of the target antigen component from a
complex mixture, and is used to discriminate or isolate minute amounts of protein. For
the isolation of membrane proteins, cells must be solubilized into detergent micelles.
Nonionic salts are preferred, since other agents, such as bile salts, precipitate at acid pH or
in the presence of bivalent cations.

In an alternative embodiment the antibodies of the present invention are useful for 
the close juxtaposition of two antigens. This is particularly useful for increasing the 
localized concentration of antigens, e.g. enzyme-substrate pairs. 

2.4 Western Blots

The compositions of the present invention will find great use in immunoblot or 
western blot analysis. The anti-peptide antibodies may be used as high-affinity primary
reagents for the identification of proteins immobilized onto a solid support matrix, such
as nitrocellulose, nylon or combinations thereof. In conjunction with
immunoprecipitation, followed by gel electrophoresis, these may be used as a single step
reagent for use in detecting antigens against which secondary reagents used in the
detection of the antigen cause an adverse background. This is especially useful when the
antigens studied are immunoglobulins (precluding the use of immunoglobulins binding
bacterial cell wall components), the antigens studied cross-react with the detecting agent,
or they migrate at the same relative molecular weight as a cross-reacting signal.

Immunologically-based detection methods for use in conjunction with Western
blotting, including enzymatically-, radiolabel-, or fluorescently-tagged secondary antibodies
against the toxin moiety, are considered to be of particular use in this regard.

2.5 Epitopic Core Sequences

The present invention is also directed to protein or peptide compositions, free 
from total cells and other peptides, which comprise a purified protein or peptide which 
incorporates an epitope that is immunologically cross-reactive with one or more
anti-crystal protein antibodies. In particular, the invention concerns epitopic core sequences
derived from Cry proteins or peptides.

As used herein, the term "incorporating an epitope(s) that is immunologically
cross-reactive with one or more anti-crystal protein antibodies" is intended to refer to a
peptide or protein antigen which includes a primary, secondary or tertiary structure
similar to an epitope located within a crystal protein or polypeptide. The level of
similarity will generally be to such a degree that monoclonal or polyclonal antibodies
directed against the crystal protein or polypeptide will also bind to, react with, or
otherwise recognize, the cross-reactive peptide or protein antigen. Various immunoassay
methods may be employed in conjunction with such antibodies, such as, for example,
Western blotting, ELISA, RIA, and the like, all of which are known to those of skill in
the art.

The identification of Cry immunodominant epitopes, and/or their functional 
equivalents, suitable for use in vaccines is a relatively straightforward matter. For 
example, one may employ the methods of Hopp, as taught in U. S. Patent 4,554,101,
incorporated herein by reference, which teaches the identification and preparation of
epitopes from amino acid sequences on the basis of hydrophilicity. The methods
described in several other papers, and software programs based thereon, can also be used
to identify epitopic core sequences (see, for example, Jameson and Wolf, 1988; Wolf et
al., 1988; U. S. Patent 4,554,101). The amino acid sequence of these "epitopic core
sequences" may then be readily incorporated into peptides, either through the application
of peptide synthesis or recombinant technology.
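
By way of illustration only, a hydrophilicity-based search of the kind referred to above may be sketched in Python as follows. The sketch is not the procedure of U. S. Patent 4,554,101 itself; it assumes the commonly tabulated Hopp-Woods hydrophilicity scale and a six-residue averaging window, and the function names and the example sequence are hypothetical.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (assumed scale).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Return (start_index, mean_hydrophilicity) for every window position."""
    seq = sequence.upper()
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        score = sum(HOPP_WOODS[aa] for aa in segment) / window
        profile.append((start, score))
    return profile


def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, peptide, score) for the highest-scoring window."""
    profile = hydrophilicity_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the averaging window")
    start, score = max(profile, key=lambda item: item[1])
    return start, sequence.upper()[start:start + window], score


if __name__ == "__main__":
    # Hypothetical sequence: a charged stretch flanked by leucine runs.
    print(most_hydrophilic_window("LLLLLLKDERKDLLLLLL"))
```

The highest-scoring window is only a candidate region; whether it behaves as an epitope would still have to be confirmed experimentally, for example by the immunoassays described herein.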

Preferred peptides for use in accordance with the present invention will generally 
be on the order of about 8 to about 20 amino acids in length, and more preferably about 8 
to about 15 amino acids in length. It is proposed that shorter antigenic crystal
protein-derived peptides will provide advantages in certain circumstances, for example, in the
preparation of immunologic detection assays. Exemplary advantages include the ease of
preparation and purification, the relatively low cost and improved reproducibility of
production, and advantageous biodistribution.

It is proposed that particular advantages of the present invention may be realized
through the preparation of synthetic peptides which include modified and/or extended
epitopic/immunogenic core sequences which result in a "universal" epitopic peptide
directed to crystal proteins, and in particular Cry and Cry-related sequences. These
epitopic core sequences are identified herein in particular aspects as hydrophilic regions
of the particular polypeptide antigen. It is proposed that these regions represent those
which are most likely to promote T-cell or B-cell stimulation, and, hence, elicit specific
antibody production.

An epitopic core sequence, as used herein, is a relatively short stretch of amino 
acids that is "complementary" to, and therefore will bind, antigen binding sites on the 
crystal protein-directed antibodies disclosed herein. Additionally or alternatively, an
epitopic core sequence is one that will elicit antibodies that are cross-reactive with
antibodies directed against the peptide compositions of the present invention. It will be
understood that in the context of the present disclosure, the term "complementary" refers
to amino acids or peptides that exhibit an attractive force towards each other. Thus,
certain epitope core sequences of the present invention may be operationally defined in
terms of their ability to compete with or perhaps displace the binding of the desired
protein antigen with the corresponding protein-directed antisera.

In general, the size of the polypeptide antigen is not believed to be particularly 
crucial, so long as it is at least large enough to carry the identified core sequence or 
sequences. The smallest useful core sequence anticipated by the present disclosure would
generally be on the order of about 8 amino acids in length, with sequences on the order of
10 to 20 being more preferred. Thus, this size will generally correspond to the smallest
peptide antigens prepared in accordance with the invention. However, the size of the
antigen may be larger where desired, so long as it contains a basic epitopic core sequence.

The identification of epitopic core sequences is known to those of skill in the art,
for example, as described in U. S. Patent 4,554,101, incorporated herein by reference,
which teaches the identification and preparation of epitopes from amino acid sequences
on the basis of hydrophilicity. Moreover, numerous computer programs are available for
use in predicting antigenic portions of proteins (see e.g., Jameson and Wolf, 1988; Wolf
et al., 1988). Computerized peptide sequence analysis programs (e.g., DNAStar®
software, DNAStar, Inc., Madison, WI) may also be useful in designing synthetic
peptides in accordance with the present disclosure.

Syntheses of epitopic sequences, or peptides which include an antigenic epitope 
within their sequence, are readily achieved using conventional synthetic techniques such 
as the solid phase method (e.g., through the use of a commercially available peptide
synthesizer such as an Applied Biosystems Model 430A Peptide Synthesizer). Peptide
antigens synthesized in this manner may then be aliquoted in predetermined amounts and
stored in conventional manners, such as in aqueous solutions or, even more preferably, in
a powder or lyophilized state pending use.

In general, due to the relative stability of peptides, they may be readily stored in
aqueous solutions for fairly long periods of time if desired, e.g., up to six months or more,
in virtually any aqueous solution without appreciable degradation or loss of antigenic
activity. However, where extended aqueous storage is contemplated it will generally be
desirable to include agents including buffers such as Tris or phosphate buffers to maintain
a pH of about 7.0 to about 7.5. Moreover, it may be desirable to include agents which
will inhibit microbial growth, such as sodium azide or Merthiolate. For extended storage
in an aqueous state it will be desirable to store the solutions at about 4°C, or more
preferably, frozen. Of course, where the peptides are stored in a lyophilized or powdered
state, they may be stored virtually indefinitely, e.g., in metered aliquots that may be
rehydrated with a predetermined amount of water (preferably distilled) or buffer prior to
use.

2.6 Nucleic Acid Segments Encoding Crystal Protein Chimeras 

The present invention also concerns DNA segments, whether native, synthetic, or
mutagenized, that can be synthesized, or isolated from virtually any source, that are free
from total genomic DNA and that encode the novel chimeric peptides disclosed herein.
DNA segments encoding these peptide species may prove to encode proteins,
polypeptides, subunits, functional domains, and the like of crystal protein-related or other
non-related gene products. In addition, these DNA segments may be synthesized entirely
in vitro using methods that are well-known to those of skill in the art.

As used herein, the term "DNA segment" refers to a DNA molecule that has been
isolated free of total genomic DNA of a particular species. Therefore, a DNA segment
encoding a crystal protein or peptide refers to a DNA segment that contains crystal
protein coding sequences yet is isolated away from, or purified free from, total genomic
DNA of the species from which the DNA segment is obtained, which in the instant case
is the genome of the Gram-positive bacterial genus, Bacillus, and in particular, the
species of Bacillus known as B. thuringiensis. Included within the term "DNA segment",
are DNA segments and smaller fragments of such segments, and also recombinant
vectors, including, for example, plasmids, cosmids, phagemids, phage, viruses, and the
like.

Similarly, a DNA segment comprising an isolated or purified crystal
protein-encoding gene refers to a DNA segment which may include, in addition to peptide
encoding sequences, certain other elements such as regulatory sequences, isolated
substantially away from other naturally occurring genes or protein-encoding sequences.
In this respect, the term "gene" is used for simplicity to refer to a functional protein-,
polypeptide- or peptide-encoding unit. As will be understood by those in the art, this
functional term includes genomic sequences, operon sequences and smaller
engineered gene segments that express, or may be adapted to express, proteins,
polypeptides or peptides.

"Isolated substantially away from other coding sequences" means that the gene of
interest, in this case, a gene encoding a bacterial crystal protein, forms the significant part
of the coding region of the DNA segment, and that the DNA segment does not contain
large portions of naturally-occurring coding DNA, such as large chromosomal fragments
or other functional genes or operon coding regions. Of course, this refers to the DNA
segment as originally isolated, and does not exclude genes, recombinant genes, synthetic
linkers, or coding regions later added to the segment by the hand of man.

In particular embodiments, the invention concerns isolated DNA segments and
recombinant vectors incorporating DNA sequences that encode a Cry peptide species that
includes within its amino acid sequence an amino acid sequence essentially as set forth in
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, or SEQ ID NO:34.

The term "a sequence essentially as set forth in SEQ ID NO:10, SEQ ID NO:12,
SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34"
means that the sequence substantially corresponds to a portion of the sequence of any of
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, or SEQ ID NO:34 and has relatively few amino acids that are not
identical to, or a biologically functional equivalent of, the amino acids of any of these
sequences. The term "biologically functional equivalent" is well understood in the art
and is further defined in detail herein (e.g., see Illustrative Embodiments). Accordingly,
sequences that have between about 70% and about 80%, or more preferably between
about 81% and about 90%, or even more preferably between about 91% and about 99%
amino acid sequence identity or functional equivalence to the amino acids of
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, or SEQ ID NO:34 will be sequences that are "essentially as set forth in
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, or SEQ ID NO:34."
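
The identity ranges recited above may be illustrated with the following minimal Python sketch. It assumes the two amino acid sequences have already been aligned (gaps written as "-") and counts strict identity only; it makes no attempt to score biologically functional equivalents. The function names, tier labels, and example sequences are hypothetical and do not correspond to any sequence of the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap positions that carry identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def identity_tier(pct):
    """Map a percent identity onto the ranges recited in the text (assumed cutoffs)."""
    if pct >= 100.0:
        return "identical"
    if pct >= 91.0:
        return "even more preferred"
    if pct >= 81.0:
        return "more preferred"
    if pct >= 70.0:
        return "preferred"
    return "below range"


if __name__ == "__main__":
    pct = percent_identity("MKTAYIAKQR", "MKTAYLAKQK")
    print(pct, identity_tier(pct))
```

Because the text speaks of "about" 70%, 81%, and 91%, the hard cutoffs used here are a simplification chosen for the sketch.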

It will also be understood that amino acid and nucleic acid sequences may include 
additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, 
and yet still be essentially as set forth in one of the sequences disclosed herein, so long as 
the sequence meets the criteria set forth above, including the maintenance of biological
protein activity where protein expression is concerned. The addition of terminal
sequences particularly applies to nucleic acid sequences that may, for example, include
various non-coding sequences flanking either of the 5' or 3' portions of the coding region
or may include various internal sequences, i.e., introns, which are known to occur within
genes.

The nucleic acid segments of the present invention, regardless of the length of the
coding sequence itself, may be combined with other DNA sequences, such as promoters,
polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall length may vary considerably. It is
therefore contemplated that a nucleic acid fragment of almost any length may be
employed, with the total length preferably being limited by the ease of preparation and
use in the intended recombinant DNA protocol. For example, nucleic acid fragments
may be prepared that include a short contiguous stretch encoding any of the peptide
sequences disclosed in SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34, or that are identical to or
complementary to DNA sequences which encode any of the peptides disclosed in
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, or SEQ ID NO:34, and particularly those DNA segments disclosed in
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, or SEQ ID NO:33. For example, DNA sequences of about 14
nucleotides, and those that are up to about 10,000, about 5,000, about 3,000, about 2,000, about
1,000, about 500, about 200, about 100, about 50, and about 14 base pairs in length
(including all intermediate lengths) are also contemplated to be useful.

It will be readily understood that "intermediate lengths", in these contexts, means
any length between the quoted ranges, such as 14, 15, 16, 17, 18, 19, 20, etc.; 21, 22, 23,
etc.; 30, 31, 32, etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.;
including all integers through the 200-500; 500-1,000; 1,000-2,000; 2,000-3,000;
3,000-5,000 ranges; and up to and including sequences of about 10,000 nucleotides and the like.

It will also be understood that this invention is not limited to the particular nucleic
acid sequences which encode peptides of the present invention, or which encode the
amino acid sequences of SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34, including those
DNA sequences which are particularly disclosed in SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33.
Recombinant vectors and isolated DNA segments may therefore variously include the
peptide-coding regions themselves, coding regions bearing selected alterations or
modifications in the basic coding region, or they may encode larger polypeptides that
nevertheless include these peptide-coding regions or may encode biologically functional
equivalent proteins or peptides that have variant amino acid sequences.

The DNA segments of the present invention encompass biologically-functional,
equivalent peptides. Such sequences may arise as a consequence of codon redundancy
and functional equivalency that are known to occur naturally within nucleic acid
sequences and the proteins thus encoded. Alternatively, functionally-equivalent proteins
or peptides may be created via the application of recombinant DNA technology, in which
changes in the protein structure may be engineered, based on considerations of the
properties of the amino acids being exchanged. Changes designed by man may be
introduced through the application of site-directed mutagenesis techniques, e.g., to
introduce improvements to the antigenicity of the protein or to test mutants in order to
examine activity at the molecular level.
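
Codon redundancy of the kind mentioned above can be illustrated with a short Python sketch. It assumes the standard genetic code and uses two hypothetical nine-nucleotide coding sequences that are not taken from the Sequence Listing; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence (length a multiple of 3) into amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_peptide(dna_a, dna_b):
    """True when two different DNA sequences encode the identical peptide."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # Hypothetical sequences differing only at third ("wobble") codon positions.
    print(translate("ATGAAAGAT"), translate("ATGAAGGAC"))
    print(encode_same_peptide("ATGAAAGAT", "ATGAAGGAC"))
```

Two segments that pass this check differ at the nucleic acid level yet encode the same peptide, which is the simplest case of the equivalence discussed in the preceding paragraph.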

If desired, one may also prepare fusion proteins and peptides, e.g., where the
peptide-coding regions are aligned within the same expression unit with other proteins or
peptides having desired functions, such as for purification or immunodetection purposes
(e.g., proteins that may be purified by affinity chromatography and enzyme label coding
regions, respectively).

Recombinant vectors form further aspects of the present invention. Particularly
useful vectors are contemplated to be those vectors in which the coding portion of the
DNA segment, whether encoding a full length protein or smaller peptide, is positioned
under the control of a promoter. The promoter may be in the form of the promoter that is
naturally associated with a gene encoding peptides of the present invention, as may be
obtained by isolating the 5' non-coding sequences located upstream of the coding
segment or exon, for example, using recombinant cloning and/or PCR™ technology, in
connection with the compositions disclosed herein.

2.7 Recombinant Vectors and Protein Expression

In other embodiments, it is contemplated that certain advantages will be gained by 
positioning the coding DNA segment under the control of a recombinant, or 
heterologous, promoter. As used herein, a recombinant or heterologous promoter is
intended to refer to a promoter that is not normally associated with a DNA segment
encoding a crystal protein or peptide in its natural environment. Such promoters may
include promoters normally associated with other genes, and/or promoters isolated from
any bacterial, viral, eukaryotic, or plant cell. Naturally, it will be important to employ a
promoter that effectively directs the expression of the DNA segment in the cell type,
organism, or even animal, chosen for expression. The use of promoter and cell type
combinations for protein expression is generally known to those of skill in the art of
molecular biology, for example, see Sambrook et al., 1989. The promoters employed
may be constitutive, or inducible, and can be used under the appropriate conditions to
direct high level expression of the introduced DNA segment, such as is advantageous in
the large-scale production of recombinant proteins or peptides. Appropriate promoter
systems contemplated for use in high-level expression include, but are not limited to, the
Pichia expression vector system (Pharmacia LKB Biotechnology).

In connection with expression embodiments to prepare recombinant proteins and
peptides, it is contemplated that longer DNA segments will most often be used, with
DNA segments encoding the entire peptide sequence being most preferred. However, it
will be appreciated that the use of shorter DNA segments to direct the expression of
crystal peptides or epitopic core regions, such as may be used to generate anti-crystal
protein antibodies, also falls within the scope of the invention. DNA segments that
encode peptide antigens from about 8 to about 50 amino acids in length, or more
preferably, from about 8 to about 30 amino acids in length, or even more preferably, from
about 8 to about 20 amino acids in length are contemplated to be particularly useful.
Such peptide epitopes may be amino acid sequences which comprise contiguous amino
acid sequences from SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34; or any peptide epitope encoded by
the nucleic acid sequences of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33.

Methods for the recombinant expression of crystal proteins and vectors useful in 
the expression of DNA constructs encoding crystal proteins are described in Intl. Pat.
Appl. Publ. No. WO 95/02058, specifically incorporated herein by reference.



2.8 Recombinant Host Cells 

Table 2
Strains Deposited with NRRL

Strain     Plasmid    Accession Number    Deposit Date
EG11063    pEG1068    B-21579             June 26, 1996
EG11074    pEG1077    B-21580             June 26, 1996
EG11091    pEG1092    B-21780             May XX, 1997
EG11092    pEG1093    B-21635             November 14, 1996
EG11735    pEG365     B-21581             June 26, 1996
EG11751    pEG378     B-21636             November 14, 1996
EG11768    pEG381     B-21781             May XX, 1997

2.9 DNA Segments as Hybridization Probes and Primers 

In addition to their use in directing the expression of crystal proteins or peptides
of the present invention, the nucleic acid sequences contemplated herein also have a
variety of other uses. For example, they also have utility as probes or primers in nucleic
acid hybridization embodiments. As such, it is contemplated that nucleic acid segments
that comprise a sequence region that consists of at least a 14 nucleotide long contiguous
sequence that has the same sequence as, or is complementary to, a 14 nucleotide long
contiguous DNA segment of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33 will find particular
utility. Also, nucleic acid segments which encode at least a 6 amino acid contiguous
sequence from SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34, are also preferred. Longer
contiguous identical or complementary sequences, e.g., those of about 20, 30, 40, 50,
100, 200, 500, 1000, 2000, 5000, 10000 etc. (including all intermediate lengths and up to
and including full-length sequences) will also be of use in certain embodiments.
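
The 14-nucleotide criterion described above may be sketched in Python as follows. The sketch simply asks whether a candidate probe shares a contiguous stretch of a given length with a target sequence, either as the same sequence or as its reverse complement. The function names are hypothetical, the example target is an arbitrary illustrative string rather than any sequence of the Sequence Listing, and exact string matching is assumed (no mismatches, no hybridization thermodynamics).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def shared_stretches(probe, target, length=14):
    """List (probe_offset, kind) for each probe window found in the target.

    kind is "identical" when the window occurs in the target strand and
    "complementary" when it occurs in the target's reverse complement.
    """
    probe = probe.upper()
    target = target.upper()
    target_rc = reverse_complement(target)
    hits = []
    for offset in range(len(probe) - length + 1):
        window = probe[offset:offset + length]
        if window in target:
            hits.append((offset, "identical"))
        elif window in target_rc:
            hits.append((offset, "complementary"))
    return hits


if __name__ == "__main__":
    target = "ATGGCTAGCTAGGATCCGATTACGCTAGC"  # hypothetical target
    probe = reverse_complement(target[5:19])   # 14-nt antisense probe
    print(shared_stretches(probe, target))
```

A probe shorter than the chosen length yields no hits, which mirrors the "at least a 14 nucleotide long contiguous sequence" language; the length parameter can be raised to model the longer stretches also contemplated.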

The ability of such nucleic acid probes to specifically hybridize to crystal
protein-encoding sequences will enable them to be of use in detecting the presence of
complementary sequences in a given sample. However, other uses are envisioned,
including the use of the sequence information for the preparation of mutant species
primers, or primers for use in preparing other genetic constructions.

Nucleic acid molecules having sequence regions consisting of contiguous
nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so,
identical or complementary to DNA sequences of SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33, are
particularly contemplated as hybridization probes for use in, e.g., Southern and Northern
blotting. Smaller fragments will generally find use in hybridization embodiments,
wherein the length of the contiguous complementary region may be varied, such as
between about 10-14 and about 100 or 200 nucleotides, but larger contiguous
complementarity stretches may be used, according to the length of the complementary
sequences one wishes to detect.

Of course, fragments may also be obtained by other techniques such as, e.g., by
mechanical shearing or by restriction enzyme digestion. Small nucleic acid segments or
fragments may be readily prepared by, for example, directly synthesizing the fragment by
chemical means, as is commonly practiced using an automated oligonucleotide
synthesizer. Also, fragments may be obtained by application of nucleic acid reproduction
technology, such as the PCR™ technology of U. S. Patents 4,683,195 and 4,683,202
(each specifically incorporated herein by reference), by introducing selected sequences
into recombinant vectors for recombinant production, and by other recombinant DNA
techniques generally known to those of skill in the art of molecular biology.

Accordingly, the nucleotide sequences of the invention may be used for their
ability to selectively form duplex molecules with complementary stretches of DNA
fragments. Depending on the application envisioned, one will desire to employ varying
conditions of hybridization to achieve varying degrees of selectivity of probe towards
target sequence. For applications requiring high selectivity, one will typically desire to
employ relatively stringent conditions to form the hybrids, e.g., one will select relatively
low salt and/or high temperature conditions, such as provided by about 0.02 M to about
0.15 M NaCl at temperatures of about 50°C to about 70°C. Such selective conditions
tolerate little, if any, mismatch between the probe and the template or target strand, and
would be particularly suitable for isolating crystal protein-encoding DNA segments.
Detection of DNA segments via hybridization is well-known to those of skill in the art,
and the teachings of U. S. Patents 4,965,188 and 5,176,995 (each specifically
incorporated herein by reference) are exemplary of the methods of hybridization analyses.
Teachings such as those found in the texts of Maloy et al., 1994; Segal, 1976; Prokop,
1991; and Kuby, 1994, are particularly relevant.

Of course, for some applications, for example, where one desires to prepare
mutants employing a mutant primer strand hybridized to an underlying template or where
one seeks to isolate crystal protein-encoding sequences from related species, functional
equivalents, or the like, less stringent hybridization conditions will typically be needed in
order to allow formation of the heteroduplex. In these circumstances, one may desire to
employ conditions such as about 0.15 M to about 0.9 M salt, at temperatures ranging
from about 20°C to about 55°C. Cross-hybridizing species can thereby be readily
identified as positively hybridizing signals with respect to control hybridizations. In any
case, it is generally appreciated that conditions can be rendered more stringent by the
addition of increasing amounts of formamide, which serves to destabilize the hybrid
duplex in the same manner as increased temperature. Thus, hybridization conditions can
be readily manipulated, and the conditions selected will generally depend on the
desired results.

In certain embodiments, it will be advantageous to employ nucleic acid sequences
of the present invention in combination with an appropriate means, such as a label, for
determining hybridization. A wide variety of appropriate indicator means are known in
the art, including fluorescent, radioactive, enzymatic or other ligands, such as
avidin/biotin, which are capable of giving a detectable signal. In preferred embodiments,
one will likely desire to employ a fluorescent label or an enzyme tag, such as urease,
alkaline phosphatase or peroxidase, instead of radioactive or other environmentally
undesirable reagents. In the case of enzyme tags, colorimetric indicator substrates are
known that can be employed to provide a means visible to the human eye or
spectrophotometrically, to identify specific hybridization with complementary nucleic
acid-containing samples.

In general, it is envisioned that the hybridization probes described herein will be
useful both as reagents in solution hybridization as well as in embodiments employing a
solid phase. In embodiments involving a solid phase, the test DNA (or RNA) is adsorbed
or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic
acid is then subjected to specific hybridization with selected probes under desired
conditions. The selected conditions will depend on the particular circumstances based on
the particular criteria required (depending, for example, on the G+C content, type of
target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following
washing of the hybridized surface so as to remove nonspecifically bound probe
molecules, specific hybridization is detected, or even quantitated, by means of the label.

2.10 Biological Functional Equivalents 

Modifications and changes may be made in the structure of the peptides of the
present invention and DNA segments which encode them and still obtain a functional
molecule that encodes a protein or peptide with desirable characteristics. The following
is a discussion based upon changing the amino acids of a protein to create an equivalent,
or even an improved, second-generation molecule. In particular embodiments of the
invention, mutated crystal proteins are contemplated to be useful for increasing the
insecticidal activity of the protein, and consequently increasing the insecticidal activity
and/or expression of the recombinant transgene in a plant cell. The amino acid changes
may be achieved by changing the codons of the DNA sequence, according to the codons
given in Table 3.

Table 3

Amino Acid                     Codons

Alanine          Ala   A       GCA GCC GCG GCU
Cysteine         Cys   C       UGC UGU
Aspartic acid    Asp   D       GAC GAU
Glutamic acid    Glu   E       GAA GAG
Phenylalanine    Phe   F       UUC UUU
Glycine          Gly   G       GGA GGC GGG GGU
Histidine        His   H       CAC CAU
Isoleucine       Ile   I       AUA AUC AUU
Lysine           Lys   K       AAA AAG
Leucine          Leu   L       UUA UUG CUA CUC CUG CUU
Methionine       Met   M       AUG
Asparagine       Asn   N       AAC AAU
Proline          Pro   P       CCA CCC CCG CCU
Glutamine        Gln   Q       CAA CAG
Arginine         Arg   R       AGA AGG CGA CGC CGG CGU
Serine           Ser   S       AGC AGU UCA UCC UCG UCU
Threonine        Thr   T       ACA ACC ACG ACU
Valine           Val   V       GUA GUC GUG GUU
Tryptophan       Trp   W       UGG
Tyrosine         Tyr   Y       UAC UAU
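
As a non-limiting illustration of how the codon assignments of Table 3 might be used to
plan an amino acid change, the following Python sketch stores the standard codon
assignments as a dictionary and ranks the codons of a desired replacement residue by the
number of nucleotide changes required. The names CODON_TABLE, AMINO_ACID_OF, and
codons_for_substitution are hypothetical and are introduced here only for illustration.

```python
# Standard codon assignments for the twenty amino acids (RNA alphabet).
CODON_TABLE = {
    "Ala": ["GCA", "GCC", "GCG", "GCU"],
    "Cys": ["UGC", "UGU"],
    "Asp": ["GAC", "GAU"],
    "Glu": ["GAA", "GAG"],
    "Phe": ["UUC", "UUU"],
    "Gly": ["GGA", "GGC", "GGG", "GGU"],
    "His": ["CAC", "CAU"],
    "Ile": ["AUA", "AUC", "AUU"],
    "Lys": ["AAA", "AAG"],
    "Leu": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "Met": ["AUG"],
    "Asn": ["AAC", "AAU"],
    "Pro": ["CCA", "CCC", "CCG", "CCU"],
    "Gln": ["CAA", "CAG"],
    "Arg": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "Ser": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "Thr": ["ACA", "ACC", "ACG", "ACU"],
    "Val": ["GUA", "GUC", "GUG", "GUU"],
    "Trp": ["UGG"],
    "Tyr": ["UAC", "UAU"],
}

# Reverse lookup: codon -> three-letter amino acid code.
AMINO_ACID_OF = {codon: aa for aa, codons in CODON_TABLE.items() for codon in codons}


def codons_for_substitution(current_codon: str, new_amino_acid: str) -> list:
    """Rank codons for new_amino_acid by how many bases differ from current_codon."""
    current = current_codon.upper().replace("T", "U")  # accept DNA or RNA spelling

    def differences(codon: str) -> int:
        return sum(a != b for a, b in zip(current, codon))

    # Fewest nucleotide changes first; ties broken alphabetically.
    return sorted(CODON_TABLE[new_amino_acid], key=lambda c: (differences(c), c))


# Example: changing a lysine codon to an arginine codon.
print(AMINO_ACID_OF["AAA"], codons_for_substitution("AAA", "Arg"))
```

Ranking by the number of changed bases is only one possible design choice; codon usage
of the intended host is not considered in this sketch.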



For example, certain amino acids may be substituted for other amino acids in a
protein structure without appreciable loss of interactive binding capacity with structures
such as, for example, antigen-binding regions of antibodies or binding sites on substrate
molecules. Since it is the interactive capacity and nature of a protein that defines that
protein's biological functional activity, certain amino acid sequence substitutions can be
made in a protein sequence, and, of course, its underlying DNA coding sequence, and
nevertheless obtain a protein with like properties. It is thus contemplated by the inventors
that various changes may be made in the peptide sequences of the disclosed
compositions, or corresponding DNA sequences which encode said peptides without
appreciable loss of their biological utility or activity.

In making such changes, the hydropathic index of amino acids may be considered.
The importance of the hydropathic amino acid index in conferring interactive biologic
function on a protein is generally understood in the art (Kyte and Doolittle, 1982,
incorporated herein by reference). It is accepted that the relative hydropathic character of
the amino acid contributes to the secondary structure of the resultant protein, which in
turn defines the interaction of the protein with other molecules, for example, enzymes,
substrates, receptors, DNA, antibodies, antigens, and the like.

Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics (Kyte and Doolittle, 1982). These are:
isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine
(+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8);
tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5);
glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other amino
acids having a similar hydropathic index or score and still result in a protein with similar
biological activity, i.e., still obtain a biologically functionally equivalent protein. In
making such changes, the substitution of amino acids whose hydropathic indices are
within ±2 is preferred, those which are within ±1 are particularly preferred, and those
within ±0.5 are even more particularly preferred.
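
The ±2, ±1, and ±0.5 windows recited above lend themselves to a simple lookup. The
following Python sketch, offered only as an illustration, lists the residues whose
hydropathic index lies within a chosen tolerance of a given residue, using the index
values recited above. The names HYDROPATHY and candidate_substitutions are hypothetical.

```python
# Hydropathic indices as recited above (Kyte and Doolittle, 1982).
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def candidate_substitutions(residue: str, tolerance: float = 2.0) -> list:
    """Residues whose hydropathic index is within +/- tolerance of `residue`."""
    base = HYDROPATHY[residue]
    eps = 1e-9  # guards against floating-point rounding at the boundary
    matches = [aa for aa, value in HYDROPATHY.items()
               if aa != residue and abs(value - base) <= tolerance + eps]
    # Closest index first; ties broken alphabetically.
    return sorted(matches, key=lambda aa: (abs(HYDROPATHY[aa] - base), aa))


for tol in (2.0, 1.0, 0.5):
    print("Ile", tol, candidate_substitutions("Ile", tol))
```

A similar index alone does not establish functional equivalence; for example, lysine and
glutamate have close indices but opposite charges, so a list of this kind is at most a
starting point for selecting variants to prepare and test.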

It is also understood in the art that the substitution of like amino acids can be 
made effectively on the basis of hydrophilicity. U. S. Patent 4,554,101, incorporated 
herein by reference, states that the greatest local average hydrophilicity of a protein, as 
governed by the hydrophilicity of its adjacent amino acids, correlates with a biological
property of the protein. 

As detailed in U. S. Patent 4,554,101, the following hydrophilicity values have
been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1);
glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0);
threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0);
methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4).

It is understood that an amino acid can be substituted for another having a similar
hydrophilicity value and still obtain a biologically equivalent, and in particular, an
immunologically equivalent protein. In such changes, the substitution of amino acids
whose hydrophilicity values are within ±2 is preferred, those which are within ±1 are
particularly preferred, and those within ±0.5 are even more particularly preferred.
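
The notion of a greatest local average hydrophilicity, referred to above, can be
illustrated by averaging the recited hydrophilicity values over a sliding window of
adjacent residues. The Python sketch below is illustrative only; the window length of
six residues is an assumption made for this example, the central values are used where
a range (± 1) is recited, and the names and the example peptide are hypothetical.

```python
# Hydrophilicity values as recited above, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(peptide: str, window: int = 6):
    """Return (start index, average) of the window with the highest mean value."""
    peptide = peptide.upper()
    if len(peptide) < window:
        raise ValueError("peptide is shorter than the averaging window")
    best_start, best_avg = 0, None
    for start in range(len(peptide) - window + 1):
        segment = peptide[start:start + window]
        avg = sum(HYDROPHILICITY[aa] for aa in segment) / window
        # Keep the first window that achieves the highest average.
        if best_avg is None or avg > best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg


# Hypothetical peptide with a charged stretch flanked by hydrophobic residues.
start, avg = most_hydrophilic_window("LLVVIIKDERKDLLVV")
print(start, avg)
```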

As outlined above, amino acid substitutions are generally therefore based on the
relative similarity of the amino acid side-chain substituents, for example, their
hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions which
take various of the foregoing characteristics into consideration are well known to those of
skill in the art and include: arginine and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine and isoleucine.

2.11 Site-Specific Mutagenesis

Site-specific mutagenesis is a technique useful in the preparation of individual
peptides, or biologically functional equivalent proteins or peptides, through specific
mutagenesis of the underlying DNA. The technique further provides a ready ability to
prepare and test sequence variants, for example, incorporating one or more of the
foregoing considerations, by introducing one or more nucleotide sequence changes into
the DNA. Site-specific mutagenesis allows the production of mutants through the use of
specific oligonucleotide sequences which encode the DNA sequence of the desired
mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer
sequence of sufficient size and sequence complexity to form a stable duplex on both sides
of the deletion junction being traversed. Typically, a primer of about 17 to 25 nucleotides
in length is preferred, with about 5 to 10 residues on both sides of the junction of the
sequence being altered.
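
The primer geometry described above (a mutated region flanked on each side by unaltered
residues) can be sketched as follows. This Python example is a minimal illustration
under stated assumptions: the function name, the 0-based position convention, and the
template sequence are hypothetical, and the primer is returned in the same sense as the
supplied template, so that its reverse complement may be required depending on which
strand is used for annealing.

```python
def design_mutagenic_primer(template: str, position: int, new_bases: str,
                            flank: int = 10):
    """Build a primer carrying new_bases in place of the template bases that
    start at 0-based `position`, with `flank` unaltered nt on each side.

    Returns (primer, in_preferred_range), where in_preferred_range reports
    whether the primer length falls within about 17 to 25 nucleotides.
    """
    template = template.upper()
    new_bases = new_bases.upper()
    end = position + len(new_bases)
    if position - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the mutation site")
    left = template[position - flank:position]   # unaltered 5' flank
    right = template[end:end + flank]            # unaltered 3' flank
    primer = left + new_bases + right
    return primer, 17 <= len(primer) <= 25


# Hypothetical 40-nt template; replace the three bases at positions 18-20.
template = "ATGGCTAGCTAGGATCCGATCGTACGATCGGCTAAGCTTA"
primer, preferred = design_mutagenic_primer(template, 18, "GCC", flank=8)
print(primer, len(primer), preferred)
```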

In general, the technique of site-specific mutagenesis is well known in the art, as
exemplified by various publications. As will be appreciated, the technique typically
employs a phage vector which exists in both a single stranded and double stranded form.
Typical vectors useful in site-directed mutagenesis include vectors such as the M13
phage. These phage are readily commercially available and their use is generally well
known to those skilled in the art. Double-stranded plasmids are also routinely employed
in site-directed mutagenesis, which eliminates the step of transferring the gene of interest
from a plasmid to a phage.

In general, site-directed mutagenesis in accordance herewith is performed by first
obtaining a single-stranded vector or melting apart the two strands of a double stranded
vector which includes within its sequence a DNA sequence which encodes the desired
peptide. An oligonucleotide primer bearing the desired mutated sequence is prepared,
generally synthetically. This primer is then annealed with the single-stranded vector, and
subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment,
in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is
formed wherein one strand encodes the original non-mutated sequence and the second
strand bears the desired mutation. This heteroduplex vector is then used to transform
appropriate cells, such as E. coli cells, and clones are selected which include recombinant
vectors bearing the mutated sequence arrangement.

The preparation of sequence variants of the selected peptide-encoding DNA
segments using site-directed mutagenesis is provided as a means of producing potentially
useful species and is not meant to be limiting as there are other ways in which sequence
variants of peptides and the DNA sequences encoding them may be obtained. For
example, recombinant vectors encoding the desired peptide sequence may be treated with
mutagenic agents, such as hydroxylamine, to obtain sequence variants.

2.12 Crystal Protein Compositions As Insecticides and Methods of Use

The inventors contemplate that the chimeric crystal protein compositions
disclosed herein will find particular utility as insecticides for topical and/or systemic
application to field crops, grasses, fruits and vegetables, and ornamental plants. In a
preferred embodiment, the bioinsecticide composition comprises an oil flowable
suspension of bacterial cells which express a novel crystal protein disclosed herein.
Preferably the cells are B. thuringiensis cells; however, any such bacterial host cell
expressing the novel nucleic acid segments disclosed herein and producing a crystal
protein is contemplated to be useful, such as B. megaterium, B. subtilis, E. coli, or
Pseudomonas spp.

In another important embodiment, the bioinsecticide composition comprises a
water dispersible granule. This granule comprises bacterial cells which express a novel
crystal protein disclosed herein. Preferred bacterial cells are B. thuringiensis cells;
however, bacteria such as B. megaterium, B. subtilis, E. coli, or Pseudomonas spp. cells
transformed with a DNA segment disclosed herein and expressing the crystal protein are
also contemplated to be useful.

In a third important embodiment, the bioinsecticide composition comprises a
wettable powder, dust, pellet, or colloidal concentrate. This powder comprises bacterial
cells which express a novel crystal protein disclosed herein. Preferred bacterial cells
are B. thuringiensis cells; however, bacteria such as B. megaterium, B. subtilis, E. coli, or
Pseudomonas spp. cells transformed with a DNA segment disclosed herein and
expressing the crystal protein are also contemplated to be useful. Such dry forms of the
insecticidal compositions may be formulated to dissolve immediately upon wetting, or
alternatively, dissolve in a controlled-release, sustained-release, or other time-dependent
manner.

In a fourth important embodiment, the bioinsecticide composition comprises an
aqueous suspension of bacterial cells such as those described above which express the
crystal protein. Such aqueous suspensions may be provided as a concentrated stock
solution which is diluted prior to application, or alternatively, as a diluted solution
ready-to-apply.

For these methods involving application of bacterial cells, the cellular host
containing the crystal protein gene(s) may be grown in any convenient nutrient medium,
where the DNA construct provides a selective advantage, providing for a selective
medium so that substantially all or all of the cells retain the B. thuringiensis gene. These
cells may then be harvested in accordance with conventional ways. Alternatively, the
cells can be treated prior to harvesting.

When the insecticidal compositions comprise intact B. thuringiensis cells
expressing the protein of interest, such bacteria may be formulated in a variety of ways.
They may be employed as wettable powders, granules or dusts, by mixing with various
inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates,
phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut
shells, and the like). The formulations may include spreader-sticker adjuvants, stabilizing
agents, other pesticidal additives, or surfactants. Liquid formulations may be
aqueous-based or non-aqueous and employed as foams, suspensions, emulsifiable
concentrates, or the like. The ingredients may include rheological agents, surfactants,
emulsifiers, dispersants, or polymers.

Alternatively, the novel chimeric Cry proteins may be prepared by recombinant
bacterial expression systems in vitro and isolated for subsequent field application. Such
protein may be either in crude cell lysates, suspensions, colloids, etc., or alternatively
may be purified, refined, buffered, and/or further processed, before formulating in an
active biocidal formulation. Likewise, under certain circumstances, it may be desirable to
isolate crystals and/or spores from bacterial cultures expressing the crystal protein and
apply solutions, suspensions, or colloidal preparations of such crystals and/or spores as
the active bioinsecticidal composition.

Regardless of the method of application, the active component(s) are applied at
an insecticidally-effective amount, which will vary depending on such
factors as, for example, the specific coleopteran insects to be controlled, the specific plant
or crop to be treated, the environmental conditions, and the method, rate, and quantity of
application of the insecticidally-active composition.

The insecticide compositions described may be made by formulating either the
bacterial cell, crystal and/or spore suspension, or isolated protein component with the
desired agriculturally-acceptable carrier. The compositions may be formulated prior to
administration in an appropriate means such as lyophilized, freeze-dried, desiccated, or in
an aqueous carrier, medium or suitable diluent, such as saline or other buffer. The
formulated compositions may be in the form of a dust or granular material, or a
suspension in oil (vegetable or mineral), or water or oil/water emulsions, or as a wettable
powder, or in combination with any other carrier material suitable for agricultural
application. Suitable agricultural carriers can be solid or liquid and are well known in the
art. The term "agriculturally-acceptable carrier" covers all adjuvants, e.g., inert
components, dispersants, surfactants, tackifiers, binders, etc. that are ordinarily used in
insecticide formulation technology; these are well known to those skilled in insecticide
formulation. The formulations may be mixed with one or more solid or liquid adjuvants
and prepared by various means, e.g., by homogeneously mixing, blending and/or grinding
the insecticidal composition with suitable adjuvants using conventional formulation
techniques.

The insecticidal compositions of this invention are applied to the environment of
the target coleopteran insect, typically onto the foliage of the plant or crop to be
protected, by conventional methods, preferably by spraying. The strength and duration of
insecticidal application will be set with regard to conditions specific to the particular
pest(s), crop(s) to be treated and particular environmental conditions. The proportional
ratio of active ingredient to carrier will naturally depend on the chemical nature,
solubility, and stability of the insecticidal composition, as well as the particular
formulation contemplated.

Other application techniques, e.g., dusting, sprinkling, soaking, soil injection,
seed coating, seedling coating, spraying, aerating, misting, atomizing, and the like, are
also feasible and may be required under certain circumstances such as, e.g., insects that
cause root or stalk infestation, or for application to delicate vegetation or ornamental
plants. These application procedures are also well-known to those of skill in the art.

The insecticidal composition of the invention may be employed in the method of
the invention singly or in combination with other compounds, including but not limited
to other pesticides. The method of the invention may also be used in conjunction with
other treatments such as surfactants, detergents, polymers or time-release formulations.
The insecticidal compositions of the present invention may be formulated for either
systemic or topical use.

The concentration of insecticidal composition which is used for environmental,
systemic, or foliar application will vary widely depending upon the nature of the
particular formulation, means of application, environmental conditions, and degree of
biocidal activity. Typically, the bioinsecticidal composition will be present in the applied
formulation at a concentration of at least about 0.5% by weight and may be up to and
including about 99% by weight. Dry formulations of the compositions may be from
about 0.5% to about 99% or more by weight of the composition, while liquid
formulations may generally comprise from about 0.5% to about 99% or more of the
active ingredient by weight. Formulations which comprise intact bacterial cells will
generally contain from about 10* to about 10*^ cells/mg. 

The insecticidal formulation may be administered to a particular plant or target
area in one or more applications as needed, with a typical field application rate per
hectare ranging on the order of from about 50 g to about 500 g of active ingredient, or of
from about 500 g to about 1000 g, or of from about 1000 g to about 5000 g or more of
active ingredient.
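
The relationship between the weight concentration of active ingredient in a formulation
and the per-hectare application rate is simple arithmetic, sketched below in Python for
illustration. The function name and the example figures (other than the ranges recited
above) are hypothetical.

```python
def product_needed_g(active_g_per_ha: float, active_weight_fraction: float,
                     hectares: float = 1.0) -> float:
    """Grams of formulated product required to deliver the stated amount of
    active ingredient per hectare over the given area.

    active_weight_fraction is the weight fraction of active ingredient in the
    formulation, e.g. 0.005 for 0.5% by weight or 0.99 for 99% by weight.
    """
    if not 0.0 < active_weight_fraction <= 1.0:
        raise ValueError("weight fraction must be greater than 0 and at most 1")
    return active_g_per_ha * hectares / active_weight_fraction


# 500 g of active ingredient per hectare from a 10% (w/w) formulation.
print(product_needed_g(500, 0.10))
# 50 g per hectare from a 0.5% (w/w) formulation, applied to two hectares.
print(product_needed_g(50, 0.005, hectares=2))
```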

2.13 Antibody Compositions and Methods for Producing

In particular embodiments, the inventors contemplate the use of antibodies, either
monoclonal or polyclonal, which bind to the crystal proteins disclosed herein. Means for
preparing and characterizing antibodies are well known in the art (See, e.g., Harlow and
Lane, 1988; incorporated herein by reference). The methods for generating monoclonal
antibodies (mAbs) generally begin along the same lines as those for preparing polyclonal
antibodies. Briefly, a polyclonal antibody is prepared by immunizing an animal with an
immunogenic composition in accordance with the present invention and collecting
antisera from that immunized animal. A wide range of animal species can be used for the
production of antisera. Typically the animal used for production of antisera is a
rabbit, a mouse, a rat, a hamster, a guinea pig or a goat. Because of the relatively large
blood volume of rabbits, a rabbit is a preferred choice for production of polyclonal
antibodies.

As is well known in the art, a given composition may vary in its immunogenicity.
It is often necessary therefore to boost the host immune system, as may be achieved by
coupling a peptide or polypeptide immunogen to a carrier. Exemplary and preferred
carriers are keyhole limpet hemocyanin (KLH) and bovine serum albumin (BSA). Other
albumins such as ovalbumin, mouse serum albumin or rabbit serum albumin can also be
used as carriers. Means for conjugating a polypeptide to a carrier protein are well known
in the art and include glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide ester,
carbodiimide and bis-diazotized benzidine.

As is also well known in the art, the immunogenicity of a particular immunogen
composition can be enhanced by the use of non-specific stimulators of the immune
response, known as adjuvants. Exemplary and preferred adjuvants include complete
Freund's adjuvant (a non-specific stimulator of the immune response containing killed
Mycobacterium tuberculosis), incomplete Freund's adjuvant and aluminum hydroxide
adjuvant.

The amount of immunogen composition used in the production of polyclonal
antibodies varies with the nature of the immunogen as well as the animal used for
immunization. A variety of routes can be used to administer the immunogen
(subcutaneous, intramuscular, intradermal, intravenous and intraperitoneal). The
production of polyclonal antibodies may be monitored by sampling blood of the
immunized animal at various points following immunization. A second, booster,
injection may also be given. The process of boosting and titering is repeated until a
suitable titer is achieved. When a desired level of immunogenicity is obtained, the
immunized animal can be bled and the serum isolated and stored, and/or the animal can
be used to generate mAbs.

mAbs may be readily prepared through use of well-known techniques, such as
those exemplified in U. S. Patent 4,196,265 (specifically incorporated herein by
reference). Typically, this technique involves immunizing a suitable animal with a
selected immunogen composition, e.g., a purified or partially purified crystal protein,
polypeptide or peptide. The immunizing composition is administered in a manner
effective to stimulate antibody producing cells. Rodents such as mice and rats are
preferred animals; however, the use of rabbit, sheep, or frog cells is also possible. The use
of rats may provide certain advantages (Goding, 1986, pp. 60-61), but mice are preferred,
with the BALB/c mouse being most preferred as this is most routinely used and generally
gives a higher percentage of stable fusions.

Following immunization, somatic cells with the potential for producing
antibodies, specifically B lymphocytes (B cells), are selected for use in the mAb
generating protocol. These cells may be obtained from biopsied spleens, tonsils or lymph
nodes, or from a peripheral blood sample. Spleen cells and peripheral blood cells are
preferred, the former because they are a rich source of antibody-producing cells that are
in the dividing plasmablast stage, and the latter because peripheral blood is easily
accessible. Often, a panel of animals will have been immunized and the spleen of the
animal with the highest antibody titer will be removed and the spleen lymphocytes
obtained by homogenizing the spleen with a syringe. Typically, a spleen from an
immunized mouse contains approximately 5 × 10^7 to 2 × 10^8 lymphocytes.

The antibody-producing B lymphocytes from the immunized animal are then
fused with cells of an immortal myeloma cell, generally one of the same species as the
animal that was immunized. Myeloma cell lines suited for use in hybridoma-producing
fusion procedures preferably are non-antibody-producing, have high fusion efficiency,
and enzyme deficiencies that render them incapable of growing in certain selective media
which support the growth of only the desired fused cells (hybridomas).

Any one of a number of myeloma cells may be used, as are known to those of
skill in the art (Goding, pp. 65-66, 1986; Campbell, pp. 75-83, 1984). For example,
where the immunized animal is a mouse, one may use P3-X63/Ag8, X63-Ag8.653,
NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11, MPC11-X45-GTG 1.7 and
S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3, IR983F and 4B210;
and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all useful in
connection with human cell fusions.

One preferred murine myeloma cell is the NS-1 myeloma cell line (also termed
P3-NS-1-Ag4-1), which is readily available from the NIGMS Human Genetic Mutant
Cell Repository by requesting cell line repository number GM3573. Another mouse
myeloma cell line that may be used is the 8-azaguanine-resistant mouse murine myeloma
SP2/0 non-producer cell line.

Methods for generating hybrids of antibody-producing spleen or lymph node cells
and myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1
ratio, though the ratio may vary from about 20:1 to about 1:1, respectively, in the
presence of an agent or agents (chemical or electrical) that promote the fusion of cell
membranes. Fusion methods using Sendai virus have been described (Kohler and
Milstein, 1975; 1976), as have those using polyethylene glycol (PEG), such as 37%
(v/v) PEG (Gefter et al., 1977). The use of electrically induced fusion methods is
also appropriate (Goding, 1986, pp. 71-74).

Fusion procedures usually produce viable hybrids at low frequencies, about
1 × 10^-6 to 1 × 10^-8. However, this does not pose a problem, as the viable, fused hybrids
are differentiated from the parental, unfused cells (particularly the unfused myeloma cells
that would normally continue to divide indefinitely) by culturing in a selective medium.
The selective medium is generally one that contains an agent that blocks the de novo
synthesis of nucleotides in the tissue culture media. Exemplary and preferred agents are
aminopterin, methotrexate, and azaserine. Aminopterin and methotrexate block de novo
synthesis of both purines and pyrimidines, whereas azaserine blocks only purine
synthesis. Where aminopterin or methotrexate is used, the media is supplemented with
hypoxanthine and thymidine as a source of nucleotides (HAT medium). Where azaserine
is used, the media is supplemented with hypoxanthine.

The preferred selection medium is HAT. Only cells capable of operating 
nucleotide salvage pathways are able to survive in HAT medium. The myeloma cells are 
defective in key enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl 
transferase (HPRT), and they cannot survive. The B-cells can operate this pathway, but 
they have a limited life span in culture and generally die within about two weeks. 
Therefore, the only cells that can survive in the selective media are those hybrids formed 
from myeloma and B-cells. 
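
The selection logic of the preceding paragraph reduces to a small truth table, sketched below. The class and field names are illustrative assumptions; the two properties simply encode the statements above that myeloma cells lack the salvage pathway and that B-cells have a limited life span in culture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellType:
    name: str
    has_salvage_pathway: bool   # e.g., functional HPRT
    divides_indefinitely: bool  # not limited to about two weeks in culture


def survives_hat(cell: CellType) -> bool:
    """With de novo synthesis blocked, long-term survival needs both properties."""
    return cell.has_salvage_pathway and cell.divides_indefinitely


CELLS = [
    CellType("unfused myeloma cell", has_salvage_pathway=False, divides_indefinitely=True),
    CellType("unfused B-cell", has_salvage_pathway=True, divides_indefinitely=False),
    CellType("myeloma/B-cell hybrid", has_salvage_pathway=True, divides_indefinitely=True),
]

if __name__ == "__main__":
    for cell in CELLS:
        outcome = "survives" if survives_hat(cell) else "is eliminated"
        print(f"{cell.name}: {outcome}")
```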

This culturing provides a population of hybridomas from which specific 
hybridomas are selected. Typically, selection of hybridomas is performed by culturing 
the cells by single-clone dilution in microtiter plates, followed by testing the individual 
clonal supernatants (after about two to three weeks) for the desired reactivity. The assay 
should be sensitive, simple and rapid, such as radioimmunoassays, enzyme 
immunoassays, cytotoxicity assays, plaque assays, dot immunobinding assays, and the 
like. 
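
Single-clone dilution is commonly reasoned about with Poisson statistics. The sketch below assumes, as a modeling choice not stated in the specification, that the number of cells seeded per well follows a Poisson distribution; the seeding densities are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well when the mean is `mean` cells per well."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_clonal(mean_cells_per_well: float) -> float:
    """Among wells that received at least one cell, the fraction seeded by exactly one."""
    if mean_cells_per_well <= 0:
        raise ValueError("mean cells per well must be positive")
    p_empty = poisson_pmf(0, mean_cells_per_well)
    p_single = poisson_pmf(1, mean_cells_per_well)
    return p_single / (1.0 - p_empty)


if __name__ == "__main__":
    for mean in (0.3, 1.0, 3.0):  # hypothetical seeding densities
        print(f"{mean} cells/well: {fraction_clonal(mean):.2f} of growing wells are clonal")
```

Lower seeding densities leave more wells empty but make it more likely that a growing well arose from a single cell.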

The selected hybridomas would then be serially diluted and cloned into individual 
antibody-producing cell lines, which clones can then be propagated indefinitely to 
provide mAbs. The cell lines may be exploited for mAb production in two basic ways. 
A sample of the hybridoma can be injected (often into the peritoneal cavity) into a 
histocompatible animal of the type that was used to provide the somatic and myeloma 
cells for the original fusion. The injected animal develops tumors secreting the specific 
monoclonal antibody produced by the fused cell hybrid. The body fluids of the animal, 
such as serum or ascites fluid, can then be tapped to provide mAbs in high concentration. 
The individual cell lines could also be cultured in vitro, where the mAbs are naturally 
secreted into the culture medium from which they can be readily obtained in high 
concentrations. mAbs produced by either means may be further purified, if desired, using 
filtration, centrifugation and various chromatographic methods such as HPLC or affinity 
chromatography. 

3. Brief Description of the Drawings 

The following drawings form part of the present specification and are included to 
further demonstrate certain aspects of the present invention. The invention may be better 
understood by reference to one or more of these drawings in combination with the 
detailed description of specific embodiments presented herein. 

FIG. 1. The wild-type δ-endotoxins and the relevant restriction sites that 
were used to construct the hybrid δ-endotoxins pertinent to the invention are diagrammed 
in FIG. 1A. Only the DNA encoding the δ-endotoxin that is contained on the indicated 
plasmid (identified by the "pEG" prefix) is shown. The B. thuringiensis strains 
containing the indicated plasmids are identified by the "EG" prefix. The hybrid 
δ-endotoxins described in the invention are diagrammed in FIG. 1B and are aligned with 
the wild-type δ-endotoxins in FIG. 1A. 

FIG. 2. An equal amount of each washed sporulated B. thuringiensis 
culture was analyzed by SDS-PAGE. Lane a: control Cry1Ac-producing B. thuringiensis 
strain EG11070; b: EG11060; c: EG11062; d: EG11063; e: EG11065; f: EG11067; g: 
EG11071; h: EG11073; i: EG11074; j: EG11088; k: EG11090; and l: EG11091. 

FIG. 3. Solubilized hybrid δ-endotoxins were exposed to trypsin for 0, 15, 
30, 60, and 120 minutes. The resulting material was analyzed by SDS-PAGE. The 
amount of active δ-endotoxin fragment remaining was quantitated by scanning 
densitometry using a Molecular Dynamics model 300A densitometer. The percent active 
toxin remaining was plotted versus time. Wild-type Cry1Ac δ-endotoxin (open box) 
served as the control. 

FIG. 4. Schematic diagrams of the wild-type toxins and the relevant 
restriction sites that were used to construct the hybrid δ-endotoxin encoded by pEG381 
and expressed in EG11768. Only the DNA encoding the δ-endotoxin that is contained on 
the indicated plasmid (identified by the "pEG" prefix) is shown. 
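
The quantity plotted in FIG. 3, percent active toxin remaining versus time, can be computed from densitometry readings as sketched below. The band intensities are invented for illustration only, and normalizing each reading to the time-zero reading is an assumption about how the percentage was derived.

```python
from typing import Dict


def percent_remaining(intensity_by_minute: Dict[int, float]) -> Dict[int, float]:
    """Express each band intensity as a percentage of the time-zero intensity."""
    baseline = intensity_by_minute[0]
    if baseline <= 0:
        raise ValueError("time-zero intensity must be positive")
    return {
        minute: 100.0 * value / baseline
        for minute, value in sorted(intensity_by_minute.items())
    }


# Hypothetical scan values at the time points named for FIG. 3 (not measured data).
HYPOTHETICAL_SCAN = {0: 2000.0, 15: 1800.0, 30: 1500.0, 60: 1000.0, 120: 500.0}

if __name__ == "__main__":
    for minute, percent in percent_remaining(HYPOTHETICAL_SCAN).items():
        print(f"{minute:>3} min: {percent:5.1f}% remaining")
```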

4. Brief Description of the Sequences 

SEQ ID NO:1 is oligonucleotide primer A. 
SEQ ID NO:2 is oligonucleotide primer B. 
SEQ ID NO:3 is oligonucleotide primer C. 
SEQ ID NO:4 is oligonucleotide primer D. 
SEQ ID NO:5 is oligonucleotide primer E. 
SEQ ID NO:6 is oligonucleotide primer F. 
SEQ ID NO:7 is oligonucleotide primer G. 
SEQ ID NO:8 is oligonucleotide primer H. 

SEQ ID NO:9 is the nucleotide and deduced amino acid sequences of the 
EG11063 hybrid δ-endotoxin. 

SEQ ID NO:10 denotes in the three-letter abbreviation form, the amino acid 
sequence for the hybrid δ-endotoxin specified in SEQ ID NO:9. 

SEQ ID NO:11 is the nucleotide and deduced amino acid sequences of the 
EG11074 hybrid δ-endotoxin. 

SEQ ID NO:12 denotes in the three-letter abbreviation form, the amino acid 
sequence for the hybrid δ-endotoxin specified in SEQ ID NO:11. 

SEQ ID NO:13 is the nucleotide and deduced amino acid sequences of the 
EG11735 hybrid δ-endotoxin. 

SEQ ID NO:14 denotes in the three-letter abbreviation form, the amino acid 
sequence for the hybrid δ-endotoxin specified in SEQ ID NO:13. 

SEQ ID NO:15 is the 5' exchange site for pEG1065, pEG1070, and pEG1074. 

SEQ ID NO:16 is the 5' exchange site for pEG1067, pEG1072, and pEG1076. 

SEQ ID NO:17 is the 5' exchange site for pEG1068, pEG1077, and pEG365. 

SEQ ID NO:18 is the 5' exchange site for pEG1088 and pEG1092. 

SEQ ID NO:19 is the 5' exchange site for pEG1089 and the 3' exchange site for 
pEG1070 and pEG1072. 

SEQ ID NO:20 is the 5' exchange site for pEG1091. 

SEQ ID NO:21 is the 3' exchange site for pEG1065, pEG1067, pEG1068, 
pEG1093, pEG378, and pEG365. 

SEQ ID NO:22 is the 3' exchange site for pEG1088. 
SEQ ID NO:23 is oligonucleotide primer I. 
SEQ ID NO:24 is oligonucleotide primer J. 

SEQ ID NO:25 is the nucleic acid sequence and deduced amino acid sequence of 
the hybrid crystal protein-encoding gene of EG11092. 

SEQ ID NO:26 is the three-letter abbreviation form of the amino acid sequence 
of the hybrid crystal protein produced by strain EG11092 encoded by SEQ ID NO:25. 

SEQ ID NO:27 is the nucleic acid sequence and the deduced amino acid 
sequence of the hybrid crystal protein-encoding gene of EG11751. 

SEQ ID NO:28 is the three-letter abbreviation form of the amino acid sequence 
of the hybrid crystal protein produced by strain EG11751 encoded by SEQ ID NO:27. 

SEQ ID NO:29 is the nucleic acid sequence and the deduced amino acid 
sequence of the hybrid crystal protein-encoding gene of EG11091. 

SEQ ID NO:30 is the three-letter abbreviation form of the amino acid sequence 
of the hybrid crystal protein produced by strain EG11091 encoded by SEQ ID NO:29. 

SEQ ID NO:31 is oligonucleotide primer K. 

SEQ ID NO:32 is the 5' exchange site for pEG378 and pEG381. 

SEQ ID NO:33 is the nucleic acid sequence and the deduced amino acid 
sequence of the hybrid crystal protein-encoding gene of EG11768. 

SEQ ID NO:34 denotes in the three-letter abbreviation form, the amino acid 
sequence of the hybrid crystal protein produced by strain EG11768 encoded by 
SEQ ID NO:33. 

SEQ ID NO:35 is the 3' exchange site for pEG1074, pEG1076, pEG1077 and 
pEG381. 
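
The pairing between the nucleotide sequence listings and their three-letter amino acid listings can be kept as a small lookup table, sketched below. The strain names follow the listing above as read from the scanned text, and the table and function names are illustrative, not part of the specification.

```python
# strain -> (SEQ ID NO of nucleotide listing, SEQ ID NO of amino acid listing)
HYBRID_TOXIN_SEQ_IDS = {
    "EG11063": (9, 10),
    "EG11074": (11, 12),
    "EG11735": (13, 14),
    "EG11092": (25, 26),
    "EG11751": (27, 28),
    "EG11091": (29, 30),
    "EG11768": (33, 34),
}


def protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the amino acid SEQ ID NO paired with a nucleotide SEQ ID NO."""
    for nucleotide_id, protein_id in HYBRID_TOXIN_SEQ_IDS.values():
        if nucleotide_id == nucleotide_seq_id:
            return protein_id
    raise KeyError(f"SEQ ID NO:{nucleotide_seq_id} is not a hybrid toxin gene listing")


if __name__ == "__main__":
    for strain, (nt_id, aa_id) in HYBRID_TOXIN_SEQ_IDS.items():
        print(f"{strain}: SEQ ID NO:{nt_id} encodes SEQ ID NO:{aa_id}")
```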


5. Description of Illustrative Embodiments 

5.1 Methods for Culturing B. thuringiensis to Produce Cry Proteins 

The B. thuringiensis strains described herein may be cultured using standard 
known media and fermentation techniques. Upon completion of the fermentation cycle, 
the bacteria may be harvested by first separating the B. thuringiensis spores and crystals 
from the fermentation broth by means well known in the art. The recovered B. 
thuringiensis spores and crystals can be formulated into a wettable powder, a liquid 
concentrate, granules or other formulations by the addition of surfactants, dispersants, 
inert carriers and other components to facilitate handling and application for particular 
target pests. The formulation and application procedures are all well known in the art and 
are used with commercial strains of B. thuringiensis (HD-1) active against Lepidoptera, 
e.g., caterpillars. 

5.2 Recombinant Host Cells for Expression of Cry Genes 

The nucleotide sequences of the subject invention can be introduced into a wide 
variety of microbial hosts. Expression of the toxin gene results, directly or indirectly, in 
the intracellular production and maintenance of the pesticide. With suitable hosts, e.g., 
Pseudomonas, the microbes can be applied to the sites of lepidopteran insects where they 
will proliferate and be ingested by the insects. The result is a control of the unwanted 
insects. Alternatively, the microbe hosting the toxin gene can be treated under conditions 
that prolong the activity of the toxin produced in the cell. The treated cell then can be 
applied to the environment of target pest(s). The resulting product retains the toxicity of 
the B. thuringiensis toxin. 

Suitable host cells, where the pesticide-containing cells will be treated to prolong 
the activity of the toxin in the cell when the treated cell is applied to the environment 
of target pest(s), may include either prokaryotes or eukaryotes, normally being limited to 
those cells which do not produce substances toxic to higher organisms, such as mammals. 
However, organisms which produce substances toxic to higher organisms could be used, 
where the toxin is unstable or the level of application sufficiently low as to avoid any 
possibility of toxicity to a mammalian host. As hosts, of particular interest will be the 
prokaryotes and the lower eukaryotes, such as fungi. Illustrative prokaryotes, both 
Gram-negative and Gram-positive, include Enterobacteriaceae, such as Escherichia, Erwinia, 
Shigella, Salmonella, and Proteus; Bacillaceae; Rhizobiaceae, such as Rhizobium; 
Spirillaceae, such as Photobacterium, Zymomonas, Serratia, Aeromonas, Vibrio, 
Desulfovibrio, Spirillum; Lactobacillaceae; Pseudomonadaceae, such as Pseudomonas 
and Acetobacter; Azotobacteraceae, Actinomycetales, and Nitrobacteraceae. Among 
eukaryotes are fungi, such as Phycomycetes and Ascomycetes, which includes yeast, such 
as Saccharomyces and Schizosaccharomyces; and Basidiomycetes yeast, such as 
Rhodotorula, Aureobasidium, Sporobolomyces, and the like. 

Characteristics of particular interest in selecting a host cell for purposes of 
production include ease of introducing the B. thuringiensis gene into the host, availability 
of expression systems, efficiency of expression, stability of the pesticide in the host, and 
the presence of auxiliary genetic capabilities. Characteristics of interest for use as a 
pesticide microcapsule include protective qualities for the pesticide, such as thick cell 
walls, pigmentation, and intracellular packaging or formation of inclusion bodies; leaf 
affinity; lack of mammalian toxicity; attractiveness to pests for ingestion; ease of killing 
and fixing without damage to the toxin; and the like. Other considerations include ease 
of formulation and handling, economics, storage stability, and the like. 

Host organisms of particular interest include yeast, such as Rhodotorula sp., 
Aureobasidium sp., Saccharomyces sp., and Sporobolomyces sp.; phylloplane organisms 
such as Pseudomonas sp., Erwinia sp. and Flavobacterium sp.; or such other organisms 
as Escherichia, Lactobacillus sp., Bacillus sp., Streptomyces sp., and the like. Specific 
organisms include Pseudomonas aeruginosa, P. fluorescens, Saccharomyces cerevisiae, 
B. thuringiensis, B. subtilis, E. coli, Streptomyces lividans and the like. 

Treatment of the microbial cell, e.g., a microbe containing the B. thuringiensis 
toxin gene, can be by chemical or physical means, or by a combination of chemical 
and/or physical means, so long as the technique does not deleteriously affect the 
properties of the toxin, nor diminish the cellular capability in protecting the toxin. 
Examples of chemical reagents are halogenating agents, particularly halogens of atomic 
no. 17-80. More particularly, iodine can be used under mild conditions and for sufficient 
time to achieve the desired results. Other suitable techniques include treatment with 
aldehydes, such as formaldehyde and glutaraldehyde; anti-infectives, such as zephiran 
chloride and cetylpyridinium chloride; alcohols, such as isopropyl and ethanol; various 
histologic fixatives, such as Lugol's iodine, Bouin's fixative, and Helly's fixative (see 
e.g., Humason, 1967); or a combination of physical (heat) and chemical agents that 
preserve and prolong the activity of the toxin produced in the cell when the cell is 
administered to a suitable host. Examples of physical means are short wavelength 
radiation such as γ-radiation and X-radiation, freezing, UV irradiation, lyophilization, and 
the like. The cells employed will usually be intact and be substantially in the 
proliferative form when treated, rather than in a spore form, although in some instances 
spores may be employed. 

Where the B. thuringiensis toxin gene is introduced via a suitable vector into a 
microbial host, and said host is applied to the environment in a living state, it is essential 
that certain host microbes be used. Microorganism hosts are selected which are known to 
occupy the "phytosphere" (phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of 
one or more crops of interest. These microorganisms are selected so as to be capable of 
successfully competing in the particular environment (crop and other insect habitats) with 
the wild-type microorganisms, provide for stable maintenance and expression of the gene 
expressing the polypeptide pesticide, and, desirably, provide for improved protection of 
the pesticide from environmental degradation and inactivation. 

A large number of microorganisms are known to inhabit the phylloplane (the 
surface of the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a 
wide variety of important crops. These microorganisms include bacteria, algae, and 
fungi. Of particular interest are microorganisms, such as bacteria, e.g., genera Bacillus, 
Pseudomonas, Erwinia, Serratia, Klebsiella, Xanthomonas, Streptomyces, Rhizobium, 
Rhodopseudomonas, Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, 
Arthrobacter, Azotobacter, Leuconostoc, and Alcaligenes; fungi, particularly yeast, e.g., 
genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, 
and Aureobasidium. Of particular interest are such phytosphere bacterial species as 
Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter 
xylinum, Agrobacterium tumefaciens, Rhodobacter sphaeroides, Xanthomonas 
campestris, Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii; and 
phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina, R. 
aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces rosei, S. 
pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus, Kluyveromyces veronae, 
and Aureobasidium pullulans. 

5.3 Definitions 

The following words and phrases have the meanings set forth below. 

Broad-Spectrum: refers to a wide range of insect species. 

Broad-Spectrum Insecticidal Activity: toxicity towards a wide range of insect 
species. 

Expression: The combination of intracellular processes, including transcription 
and translation undergone by a coding DNA molecule such as a structural gene to 
produce a polypeptide. 

Insecticidal Activity: toxicity towards insects. 

Insecticidal Specificity: the toxicity exhibited by a crystal protein towards 
multiple insect species. 

Intraorder Specificity: the toxicity of a particular crystal protein towards insect 
species within an Order of insects (e.g., Order Lepidoptera). 

Interorder Specificity: the toxicity of a particular crystal protein towards insect 
species of different Orders (e.g., Orders Lepidoptera and Diptera). 

LC50: the lethal concentration of crystal protein that causes 50% mortality of the 
insects treated. 

LC95: the lethal concentration of crystal protein that causes 95% mortality of the 
insects treated. 

Promoter: A recognition site on a DNA sequence or group of DNA sequences 
that provide an expression control element for a structural gene and to which RNA 
polymerase specifically binds and initiates RNA synthesis (transcription) of that gene. 

Regeneration: The process of growing a plant from a plant cell (e.g., plant 
protoplast or explant). 

Structural Gene: A gene that is expressed to produce a polypeptide. 

Transformation: A process of introducing an exogenous DNA sequence (e.g., a 
vector, a recombinant DNA molecule) into a cell or protoplast in which that exogenous 
DNA is incorporated into a chromosome or is capable of autonomous replication. 

Transformed Cell: A cell whose DNA has been altered by the introduction of an 
exogenous DNA molecule into that cell. 

Transgene: An exogenous gene which when introduced into the genome of a 
host cell through a process such as transformation, electroporation, particle 
bombardment, and the like, is expressed by the host cell and integrated into the cell's 
genome such that the trait or traits produced by the expression of the transgene is 
inherited by the progeny of the transformed cell. 

Transgenic Cell: Any cell derived or regenerated from a transformed cell or 
derived from a transgenic cell. Exemplary transgenic cells include plant calli derived 
from a transformed plant cell and particular cells such as leaf, root, stem, e.g., somatic 
cells, or reproductive (germ) cells obtained from a transgenic plant. 

Transgenic Plant: A plant or progeny thereof derived from a transformed plant 
cell or protoplast, wherein the plant DNA contains an introduced exogenous DNA 
molecule not originally present in a native, non-transgenic plant of the same strain. The 
terms "transgenic plant" and "transformed plant" have sometimes been used in the art as 
synonymous terms to define a plant whose DNA contains an exogenous DNA molecule. 
However, it is thought more scientifically correct to refer to a regenerated plant or callus 
obtained from a transformed plant cell or protoplast as being a transgenic plant, and that 
usage will be followed herein. 

Vector: A DNA molecule capable of replication in a host cell and/or to which 
another DNA segment can be operatively linked so as to bring about replication of the 
attached segment. A plasmid is an exemplary vector. 
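
As an illustration of the LC50 and LC95 definitions above, the sketch below estimates a lethal concentration by interpolating mortality against log10 concentration. This is a simplified stand-in chosen for brevity; the specification does not state which statistical method is used, and the bioassay data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def lethal_concentration(data: Sequence[Tuple[float, float]], target: float) -> float:
    """Interpolate the concentration giving `target` fractional mortality.

    `data` holds (concentration, fraction mortality) pairs with positive
    concentrations; interpolation is linear on log10(concentration).
    """
    points = sorted(data)
    for (c_lo, m_lo), (c_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= target <= m_hi and m_hi > m_lo:
            fraction = (target - m_lo) / (m_hi - m_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target mortality is not bracketed by the data")


# Hypothetical bioassay: (crystal protein concentration, fraction of insects killed).
HYPOTHETICAL_BIOASSAY = [(1.0, 0.2), (10.0, 0.4), (100.0, 0.6), (1000.0, 1.0)]

if __name__ == "__main__":
    print("LC50 estimate:", round(lethal_concentration(HYPOTHETICAL_BIOASSAY, 0.50), 1))
    print("LC95 estimate:", round(lethal_concentration(HYPOTHETICAL_BIOASSAY, 0.95), 1))
```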

5.4 Probes and Primers 

In another aspect, DNA sequence information provided by the invention allows 
for the preparation of relatively short DNA (or RNA) sequences having the ability to 
specifically hybridize to gene sequences of the selected polynucleotides disclosed herein. 
In these aspects, nucleic acid probes of an appropriate length are prepared based on a 
consideration of a selected crystal protein gene sequence, e.g., a sequence such as that 
shown in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:25, 
SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33. The ability of such nucleic acid 
probes to specifically hybridize to a crystal protein-encoding gene sequence lends them 
particular utility in a variety of embodiments. Most importantly, the probes may be used 
in a variety of assays for detecting the presence of complementary sequences in a given 
sample. 

In certain embodiments, it is advantageous to use oligonucleotide primers. The 
sequence of such primers is designed using a polynucleotide of the present invention for 
use in detecting, amplifying or mutating a defined segment of a crystal protein gene from 
B. thuringiensis using PCR™ technology. Segments of related crystal protein genes from 
other species may also be amplified by PCR™ using such primers. 

To provide certain of the advantages in accordance with the present invention, a 
preferred nucleic acid sequence employed for hybridization studies or assays includes 
sequences that are complementary to at least a 14 to 30 or so long nucleotide stretch of a 
crystal protein-encoding sequence, such as that shown in SEQ ID NO:9, SEQ ID NO:11, 
SEQ ID NO:13, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33. A 
size of at least 14 nucleotides in length helps to ensure that the fragment will be of 
sufficient length to form a duplex molecule that is both stable and selective. Molecules 
having complementary sequences over stretches greater than 14 bases in length are 
generally preferred, though, in order to increase stability and selectivity of the hybrid, and 
thereby improve the quality and degree of specific hybrid molecules obtained. One will 
generally prefer to design nucleic acid molecules having gene-complementary stretches 
of 14 to 20 nucleotides, or even longer where desired. Such fragments may be readily 
prepared by, for example, directly synthesizing the fragment by chemical means, by 
application of nucleic acid reproduction technology, such as the PCR™ technology of 
U.S. Patents 4,683,195 and 4,683,202 (each specifically incorporated herein by 
reference), or by excising selected DNA fragments from recombinant plasmids containing 
appropriate inserts and suitable restriction sites. 
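
The complementarity and minimum-length requirements described above can be sketched in a few lines. The target sequence below is invented for illustration and is not taken from any SEQ ID NO, and exact matching is a simplifying assumption, since real hybridization tolerates some mismatch depending on stringency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_sites(probe: str, target: str, min_length: int = 14) -> list:
    """Return 0-based positions on `target` to which `probe` is fully complementary."""
    if len(probe) < min_length:
        raise ValueError(f"probe is shorter than {min_length} nucleotides")
    site = reverse_complement(probe)  # the stretch of target the probe pairs with
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # Hypothetical target and an 18-nucleotide probe complementary to part of it.
    target = "GGATCC" + "ATGGATAACAATCCGAACATC" + "TTTAAA"
    probe = reverse_complement("ATGGATAACAATCCGAAC")
    print(probe, probe_sites(probe, target))
```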

5.5 Expression Vectors 

The present invention contemplates an expression vector comprising a 
polynucleotide of the present invention. Thus, in one embodiment an expression vector is 
an isolated and purified DNA molecule comprising a promoter operatively linked to a 
coding region that encodes a polypeptide of the present invention, which coding region is 
operatively linked to a transcription-terminating region, whereby the promoter drives the 
transcription of the coding region. 

As used herein, the term "operatively linked" means that a promoter is connected 
to a coding region in such a way that the transcription of that coding region is controlled 
and regulated by that promoter. Means for operatively linking a promoter to a coding 
region are well known in the art. 
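
The arrangement of elements in such an expression vector (promoter, then coding region, then transcription-terminating region) can be represented with a small record type, as sketched below. The class, its method, and the example element names are illustrative assumptions and do not describe any particular construct of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str
    coding_region: str
    terminator: str

    def __post_init__(self) -> None:
        # All three elements must be present for the cassette to be complete.
        for element in (self.promoter, self.coding_region, self.terminator):
            if not element.strip():
                raise ValueError("every cassette element must be named")

    def describe(self) -> str:
        """List the elements in 5'-to-3' order of transcription."""
        parts = (self.promoter, self.coding_region, self.terminator)
        return "5'-" + "-".join(f"[{part}]" for part in parts) + "-3'"


if __name__ == "__main__":
    # Hypothetical cassette; the element names are placeholders.
    cassette = ExpressionCassette(
        promoter="example promoter",
        coding_region="example crystal protein coding region",
        terminator="example transcription-terminating region",
    )
    print(cassette.describe())
```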

Promoters that function in bacteria are well known in the art. Exemplary and 
preferred promoters for the Bacillus crystal proteins include the sigA, sigE, and sigK gene 
promoters. Alternatively, the native, mutagenized, or recombinant crystal 
protein-encoding gene promoters themselves can be used. 

Where an expression vector of the present invention is to be used to transform a 
plant, a promoter is selected that has the ability to drive expression in plants. Promoters 
that function in plants are also well known in the art. Useful in expressing the 
polypeptide in plants are promoters that are inducible, viral, synthetic, constitutive as 
described (Poszkowski et al., 1989; Odell et al., 1985), and temporally regulated, 
spatially regulated, and spatio-temporally regulated (Chau et al., 1989). 

A promoter is also selected for its ability to direct the transformed plant cell's or 
transgenic plant's transcriptional activity to the coding region. Structural genes can be 
driven by a variety of promoters in plant tissues. Promoters can be near-constitutive, 
such as the CaMV 35S promoter, or tissue-specific or developmentally specific promoters 
affecting dicots or monocots. 

Where the promoter is a near-constitutive promoter such as CaMV 35S, increases 
in polypeptide expression are found in a variety of transformed plant tissues (e.g., callus, 
leaf, seed and root). Alternatively, the effects of transformation can be directed to 
specific plant tissues by using plant integrating vectors containing a tissue-specific 
promoter. 

An exemplary tissue-specific promoter is the lectin promoter, which is specific for 
seed tissue. The Lectin protein in soybean seeds is encoded by a single gene (Le1) that is 
only expressed during seed maturation and accounts for about 2 to about 5% of total seed 
mRNA. The lectin gene and seed-specific promoter have been fully characterized and 
used to direct seed specific expression in transgenic tobacco plants (Vodkin et al., 1983; 
Lindstrom et al., 1990). 

An expression vector containing a coding region that encodes a polypeptide of 
interest is engineered to be under control of the lectin promoter and that vector is 
introduced into plants using, for example, a protoplast transformation method (Dhir et al., 
1991). The expression of the polypeptide is directed specifically to the seeds of the 
transgenic plant. 

A transgenic plant of the present invention produced from a plant cell transformed 
with a tissue specific promoter can be crossed with a second transgenic plant developed 
from a plant cell transformed with a different tissue specific promoter to produce a hybrid 
transgenic plant that shows the effects of transformation in more than one specific tissue. 

Exemplary tissue-specific promoters are corn sucrose synthetase 1 (Yang et al., 
1990), corn alcohol dehydrogenase 1 (Vogel et al., 1989), corn light harvesting complex 
(Simpson, 1986), corn heat shock protein (Odell et al., 1985), pea small subunit RuBP 
carboxylase (Poulsen et al., 1986; Cashmore et al., 1983), Ti plasmid mannopine 
synthase (Langridge et al., 1989), Ti plasmid nopaline synthase (Langridge et al., 1989), 
petunia chalcone isomerase (Van Tunen et al., 1988), bean glycine rich protein 1 (Keller 
et al., 1989), CaMV 35S transcript (Odell et al., 1985) and potato patatin (Wenzler et al., 
1989). Preferred promoters are the cauliflower mosaic virus (CaMV 35S) promoter and 
the S-E9 small subunit RuBP carboxylase promoter. 

The choice of which expression vector and ultimately to which promoter a 
polypeptide coding region is operatively linked depends directly on the functional 
properties desired, e.g., the location and timing of protein expression, and the host cell to 
be transformed. These are well known limitations inherent in the art of constructing 
recombinant DNA molecules. However, a vector useful in practicing the present 
invention is capable of directing the expression of the polypeptide coding region to which 
it is operatively linked. 

Typical vectors useful for expression of genes in higher plants are well known in 
the art and include vectors derived from the tumor-inducing (Ti) plasmid of 
Agrobacterium tumefaciens described (Rogers et al., 1987). However, several other plant 
integrating vector systems are known to function in plants including pCaMVCN transfer 
control vector described (Fromm et al., 1985). pCaMVCN (available from Pharmacia, 
Piscataway, NJ) includes the cauliflower mosaic virus CaMV 35S promoter. 

In preferred embodiments, the vector used to express the polypeptide includes a 
selection marker that is effective in a plant cell, preferably a drug resistance selection 
marker. One preferred drug resistance marker is the gene whose expression results in 
kanamycin resistance; i.e., the chimeric gene containing the nopaline synthase promoter, 
Tn5 neomycin phosphotransferase II (nptII) and nopaline synthase 3' non-translated 
region described (Rogers et al., 1988). 

RNA polymerase transcribes a coding DNA sequence through a site where 
polyadenylation occurs. Typically, DNA sequences located a few hundred base pairs 
downstream of the polyadenylation site serve to terminate transcription. Those DNA 
sequences are referred to herein as transcription-termination regions. Those regions are 
required for efficient polyadenylation of transcribed messenger RNA (mRNA). 

Means for preparing expression vectors are well known in the art. Expression 
(transformation) vectors used to transform plants and methods of making those vectors 
are described in U. S. Patents 4,971,908, 4,940,835, 4,769,061 and 4,757,011 (each of 
which is specifically incorporated herein by reference). Those vectors can be modified to 
include a coding sequence in accordance with the present invention. 

A variety of methods has been developed to operatively link DNA to vectors via 
complementary cohesive termini or blunt ends. For instance, complementary 
homopolymer tracts can be added to the DNA segment to be inserted and to the vector 
DNA. The vector and DNA segment are then joined by hydrogen bonding between the 
complementary homopolymeric tails to form recombinant DNA molecules. 
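
The homopolymer-tailing approach described above can be pictured with a short sketch. The choice of dC tails on the insert and dG tails on the vector, the tail length, and the function names are illustrative assumptions; the text does not specify which nucleotides or lengths are used.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def add_tail(strand: str, base: str, length: int) -> str:
    """Append a homopolymer tract of `base` to the 3' end of `strand`."""
    if base not in COMPLEMENT or length < 1:
        raise ValueError("tail must be one or more A, C, G, or T residues")
    return strand + base * length


def tails_can_anneal(insert_tail_base: str, vector_tail_base: str) -> bool:
    """Tails hydrogen bond only when their bases are complementary."""
    return COMPLEMENT[insert_tail_base] == vector_tail_base


if __name__ == "__main__":
    insert_strand = add_tail("ATGGATAAC", "C", 12)  # hypothetical insert with a dC tail
    vector_strand = add_tail("GAATTCGTA", "G", 12)  # hypothetical vector with a dG tail
    print(insert_strand)
    print(vector_strand)
    print("tails anneal:", tails_can_anneal("C", "G"))
```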

A coding region that encodes a polypeptide having the ability to confer 
insecticidal activity to a cell is preferably a chimeric B. thuringiensis crystal 
protein-encoding gene. In preferred embodiments, such a polypeptide has the amino acid residue 
sequence of SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:26, 
SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:34; or a functional equivalent of one or 
more of those sequences. In accordance with such embodiments, a coding region 
comprising the DNA sequence of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, 
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:33 is also preferred. 



5.6 Transformed or Transgenic Plant Cells 

A bacterium, a yeast cell, or a plant cell or a plant transformed with an expression 
vector of the present invention is also contemplated. A transgenic bacterium, yeast cell, 
plant cell or plant derived from such a transformed or transgenic cell is also 
contemplated. Means for transforming bacteria and yeast cells are well known in the art. 
Typically, means of transformation are similar to those well known means used to 
transform other bacteria or yeast such as E. coli or S. cerevisiae. 

Methods for DNA transformation of plant cells include Agrobacterium-mediated 
plant transformation, protoplast transformation, gene transfer into pollen, injection into 
reproductive organs, injection into immature embryos and particle bombardment. Each 
of these methods has distinct advantages and disadvantages. Thus, one particular method 
of introducing genes into a particular plant strain may not necessarily be the most 
effective for another plant strain, but it is well known which methods are useful for a 
particular plant strain. 

There are many methods for introducing transforming DNA segments into cells, 
but not all are suitable for delivering DNA to plant cells. Suitable methods are believed 
to include virtually any method by which DNA can be introduced into a cell, such as 
infection by A. tumefaciens and related Agrobacterium, direct delivery of DNA such as, 
for example, by PEG-mediated transformation of protoplasts (Omirulleh et al., 1993), by 
desiccation/inhibition-mediated DNA uptake, by electroporation, by agitation with silicon 
carbide fibers, by acceleration of DNA coated particles, etc. In certain embodiments, 
acceleration methods are preferred and include, for example, microprojectile 
bombardment and the like. 

Technology for introduction of DNA into cells is well-known to those of skill in 
the art. Four general methods for delivering a gene into cells have been described: (1) 
chemical methods (Graham and van der Eb, 1973); (2) physical methods such as 
microinjection (Capecchi, 1980), electroporation (Wong and Neumann, 1982; Fromm 
et al., 1985) and the gene gun (Johnston and Tang, 1994; Fynan et al., 1993); (3) viral 
vectors (Clapp, 1993; Lu et al., 1993; Eglitis and Anderson, 1988a; 1988b); and (4) 
receptor-mediated mechanisms (Curiel et al., 1991; 1992; Wagner et al., 1992). 

5.6.1 Electroporation 

The application of brief, high-voltage electric pulses to a variety of animal and 
plant cells leads to the formation of nanometer-sized pores in the plasma membrane. 
DNA is taken directly into the cell cytoplasm either through these pores or as a 
consequence of the redistribution of membrane components that accompanies closure of 
the pores. Electroporation can be extremely efficient and can be used both for transient 
expression of cloned genes and for establishment of cell lines that carry integrated copies
of the gene of interest. Electroporation, in contrast to calcium phosphate-mediated 
transfection and protoplast fusion, frequently gives rise to cell lines that carry one, or at 
most a few, integrated copies of the foreign DNA. 

The introduction of DNA by means of electroporation is well known to those of
skill in the art. In this method, certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, are employed to render the target recipient cells more susceptible to
transformation by electroporation than untreated cells. Alternatively, recipient cells are
made more susceptible to transformation by mechanical wounding. To effect
transformation by electroporation one may employ either friable tissues such as a 
suspension culture of cells, or embryogenic callus, or alternatively, one may transform 
immature embryos or other organized tissues directly. One would partially degrade the 
cell walls of the chosen cells by exposing them to pectin-degrading enzymes 
(pectolyases) or mechanically wounding in a controlled manner. Such cells would then 
be recipient to DNA transfer by electroporation, which may be carried out at this stage, 
and transformed cells then identified by a suitable selection or screening protocol 
dependent on the nature of the newly incorporated DNA.

5.6.2 Microprojectile Bombardment

A further advantageous method for delivering transforming DNA segments to 
plant cells is microprojectile bombardment. In this method, particles may be coated with
nucleic acids and delivered into cells by a propelling force. Exemplary particles include
those comprised of tungsten, gold, platinum, and the like.

An advantage of microprojectile bombardment, in addition to it being an effective 
means of reproducibly stably transforming monocots, is that neither the isolation of 
protoplasts (Cristou et al., 1988) nor the susceptibility to Agrobacterium infection is
required. An illustrative embodiment of a method for delivering DNA into maize cells by
acceleration is a Biolistics Particle Delivery System, which can be used to propel particles
coated with DNA or cells through a screen, such as a stainless steel or Nytex screen, onto
a filter surface covered with corn cells cultured in suspension. The screen disperses the
particles so that they are not delivered to the recipient cells in large aggregates. It is
believed that a screen intervening between the projectile apparatus and the cells to be
bombarded reduces the size of projectile aggregates and may contribute to a higher
frequency of transformation by reducing damage inflicted on the recipient cells by 
projectiles that are too large. 

For the bombardment, cells in suspension are preferably concentrated on filters or 
solid culture medium. Alternatively, immature embryos or other target cells may be
arranged on solid culture medium. The cells to be bombarded are positioned at an
appropriate distance below the macroprojectile stopping plate. If desired, one or more
screens are also positioned between the acceleration device and the cells to be
bombarded. Through the use of techniques set forth herein one may obtain up to 1000 or
more foci of cells transiently expressing a marker gene. The number of cells in a focus
which express the exogenous gene product 48 hours post-bombardment often ranges from
1 to 10 and averages 1 to 3.

In bombardment transformation, one may optimize the prebombardment culturing 
conditions and the bombardment parameters to yield the maximum numbers of stable 
transformants. Both the physical and biological parameters for bombardment are
important in this technology. Physical factors are those that involve manipulating the
DNA/microprojectile precipitate or those that affect the flight and velocity of either the
macro- or microprojectiles. Biological factors include all steps involved in manipulation
of cells before and immediately after bombardment, the osmotic adjustment of target cells
to help alleviate the trauma associated with bombardment, and also the nature of the
transforming DNA, such as linearized DNA or intact supercoiled plasmids. It is believed
that pre-bombardment manipulations are especially important for successful 
transformation of immature embryos. 

Accordingly, it is contemplated that one may wish to adjust various of the 
bombardment parameters in small scale studies to fully optimize the conditions. One
may particularly wish to adjust physical parameters such as gap distance, flight distance,
tissue distance, and helium pressure. One may also minimize the trauma reduction
factors (TRFs) by modifying conditions which influence the physiological state of the
recipient cells and which may therefore influence transformation and integration
efficiencies. For example, the osmotic state, tissue hydration and the subculture stage or
cell cycle of the recipient cells may be adjusted for optimum transformation. The
execution of other routine adjustments will be known to those of skill in the art in light of 
the present disclosure. 

The methods of particle-mediated transformation are well known to those of skill
in the art. U. S. Patent 5,015,580 (specifically incorporated herein by reference)
describes the transformation of soybeans using such a technique.

5.6.3 Agrobacterium-Mediated Transfer

Agrobacterium-mediated transfer is a widely applicable system for introducing
genes into plant cells because the DNA can be introduced into whole plant tissues,
thereby bypassing the need for regeneration of an intact plant from a protoplast. The use
of Agrobacterium-mediated plant integrating vectors to introduce DNA into plant cells is
well known in the art. See, for example, the methods described (Fraley et al., 1985;
Rogers et al., 1987). The genetic engineering of cotton plants using Agrobacterium-mediated
transfer is described in U. S. Patent 5,004,863 (specifically incorporated herein
by reference), while the transformation of lettuce plants is described in U. S. Patent
5,349,124 (specifically incorporated herein by reference). Further, the integration of the
Ti-DNA is a relatively precise process resulting in few rearrangements. The region of
DNA to be transferred is defined by the border sequences, and intervening DNA is
usually inserted into the plant genome as described (Spielmann et al., 1986; Jorgensen et
al., 1987).

Modern Agrobacterium transformation vectors are capable of replication in E. coli
as well as Agrobacterium, allowing for convenient manipulations as described (Klee et
al., 1985). Moreover, recent technological advances in vectors for Agrobacterium-mediated
gene transfer have improved the arrangement of genes and restriction sites in
the vectors to facilitate construction of vectors capable of expressing various polypeptide
coding genes. The vectors described (Rogers et al., 1987) have convenient multi-linker
regions flanked by a promoter and a polyadenylation site for direct expression of inserted
polypeptide coding genes and are suitable for present purposes. In addition,
Agrobacterium containing both armed and disarmed Ti genes can be used for the
transformations. In those plant strains where Agrobacterium-mediated transformation is
efficient, it is the method of choice because of the facile and defined nature of the gene 
transfer. 

Agrobacterium-mediated transformation of leaf disks and other tissues such as
cotyledons and hypocotyls appears to be limited to plants that Agrobacterium naturally
infects. Agrobacterium-mediated transformation is most efficient in dicotyledonous
plants. Few monocots appear to be natural hosts for Agrobacterium, although transgenic
plants have been produced in asparagus using Agrobacterium vectors as described
(Bytebier et al., 1987). Therefore, commercially important cereal grains such as rice,
corn, and wheat must usually be transformed using alternative methods. However, as
mentioned above, the transformation of asparagus using Agrobacterium can also be
achieved (see, e.g., Bytebier et al., 1987).

A transgenic plant formed using Agrobacterium transformation methods typically 
contains a single gene on one chromosome. Such transgenic plants can be referred to as 
being heterozygous for the added gene. However, inasmuch as use of the word
"heterozygous" usually implies the presence of a complementary gene at the same locus
of the second chromosome of a pair of chromosomes, and there is no such gene in a plant
containing one added gene as here, it is believed that a more accurate name for such a 
plant is an independent segregant, because the added, exogenous gene segregates 
independently during mitosis and meiosis. 

More preferred is a transgenic plant that is homozygous for the added structural 
gene; i.e., a transgenic plant that contains two added genes, one gene at the same locus on
each chromosome of a chromosome pair. A homozygous transgenic plant can be
obtained by sexually mating (selfing) an independent segregant transgenic plant that
contains a single added gene, germinating some of the seed produced and analyzing the
resulting plants produced for enhanced carboxylase activity relative to a control (native,
non-transgenic) or an independent segregant transgenic plant.

It is to be understood that two different transgenic plants can also be mated to 
produce offspring that contain two independently segregating added, exogenous genes. 
Selfing of appropriate progeny can produce plants that are homozygous for both added, 
exogenous genes that encode a polypeptide of interest. Back-crossing to a parental plant
and out-crossing with a non-transgenic plant are also contemplated. 
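
By way of illustration only, the segregation described above can be sketched numerically. The sketch assumes simple Mendelian inheritance, a single added gene copy per locus in the selfed parent, unlinked loci, and no selection among progeny; none of these assumptions is stated in the text, and the function name is hypothetical. For the two-gene case the selfed parent is taken to be a progeny plant that carries one copy of each added gene.

```python
from fractions import Fraction
from itertools import product


def selfing_genotype_fractions(n_loci=1):
    """Expected genotype fractions among progeny of a selfed plant that
    carries one added gene copy at each of n_loci unlinked loci.

    Keys are tuples of per-locus copy numbers (0, 1, or 2).
    """
    # One locus, one copy in the parent: each gamete carries the gene or not
    # with equal probability, so progeny are expected at 1/4 : 1/2 : 1/4.
    single = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
    fractions = {}
    for combo in product(single, repeat=n_loci):
        p = Fraction(1)
        for copies in combo:
            p *= single[copies]
        fractions[combo] = p
    return fractions


one_gene = selfing_genotype_fractions(1)
two_genes = selfing_genotype_fractions(2)
print(one_gene[(2,)])      # expected fraction homozygous for one added gene
print(two_genes[(2, 2)])   # expected fraction homozygous for both added genes
```

Under these assumptions only a minority of selfed progeny is homozygous, which is why the progeny are germinated and analyzed individually.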

Transformation of plant protoplasts can be achieved using methods based on 
calcium phosphate precipitation, polyethylene glycol treatment, electroporation, and 
combinations of these treatments (see, e.g., Potrykus et al., 1985; Lorz et al., 1985;
Fromm et al., 1985; Uchimiya et al., 1986; Callis et al., 1987; Marcotte et al., 1988).

Application of these systems to different plant strains depends upon the ability to 
regenerate that particular plant strain from protoplasts. Illustrative methods for the
regeneration of cereals from protoplasts are described (see, e.g., Fujimura et al., 1985;
Toriyama et al., 1986; Yamada et al., 1986; Abdullah et al., 1986).

To transform plant strains that cannot be successfully regenerated from
protoplasts, other ways to introduce DNA into intact cells or tissues can be utilized. For
example, regeneration of cereals from immature embryos or explants can be effected as
described (Vasil, 1988). In addition, "particle gun" or high-velocity microprojectile
technology can be utilized (Vasil, 1992).

Using that latter technology, DNA is carried through the cell wall and into the
cytoplasm on the surface of small metal particles as described (Klein et al., 1987; Klein et
al., 1988; McCabe et al., 1988). The metal particles penetrate through several layers of
cells and thus allow the transformation of cells within tissue explants.

5.7 Production of Insect-Resistant Transgenic Plants 

Thus, the amount of a gene coding for a polypeptide of interest (i.e., a bacterial
crystal protein or polypeptide having insecticidal activity against one or more insect
species) can be increased in plants such as corn by transforming those plants using particle
bombardment methods (Maddock et al., 1991). By way of example, an expression vector
containing a coding region for a B. thuringiensis crystal protein and an appropriate
selectable marker is transformed into a suspension of embryonic maize (corn) cells using
a particle gun to deliver the DNA coated on microprojectiles. Transgenic plants are
regenerated from transformed embryonic calli that express the disclosed insecticidal
crystal proteins. Particle bombardment has been used to successfully transform wheat
(Vasil et al., 1992).

DNA can also be introduced into plants by direct DNA transfer into pollen as 
described (Zhou et al., 1983; Hess, 1987; Luo et al., 1988). Expression of polypeptide
coding genes can be obtained by injection of the DNA into reproductive organs of a plant
as described (Pena et al., 1987). DNA can also be injected directly into the cells of
immature embryos or introduced during the rehydration of desiccated embryos as
described (Neuhaus et al., 1987; Benbrook et al., 1986).

The development or regeneration of plants from either single plant protoplasts or 
various explants is well known in the art (Weissbach and Weissbach, 1988). This
regeneration and growth process typically includes the steps of selection of transformed
cells, culturing those individualized cells through the usual stages of embryonic 
development through the rooted plantlet stage. Transgenic embryos and seeds are 
similarly regenerated. The resulting transgenic rooted shoots are thereafter planted in an 
appropriate plant growth medium such as soil.

The development or regeneration of plants containing the foreign, exogenous gene
that encodes a polypeptide of interest introduced by Agrobacterium from leaf explants
can be achieved by methods well known in the art such as described (Horsch et al.,
1985). In this procedure, transformants are cultured in the presence of a selection agent
and in a medium that induces the regeneration of shoots in the plant strain being
transformed as described (Fraley et al., 1983). In particular, U. S. Patent 5,349,124
(specification incorporated herein by reference) details the creation of genetically
transformed lettuce cells and plants resulting therefrom which express hybrid crystal
proteins conferring insecticidal activity against Lepidopteran larvae to such plants.

This procedure typically produces shoots within two to four months and those 
shoots are then transferred to an appropriate root-inducing medium containing the 
selective agent and an antibiotic to prevent bacterial growth. Shoots that rooted in the 
presence of the selective agent to form plantlets are then transplanted to soil or other
media to allow the production of roots. These procedures vary depending upon the 
particular plant strain employed, such variations being well known in the art. 

Preferably, the regenerated plants are self-pollinated to provide homozygous 
transgenic plants, as discussed before. Otherwise, pollen obtained from the regenerated 
plants is crossed to seed-grown plants of agronomically important, preferably inbred 
lines. Conversely, pollen from plants of those important lines is used to pollinate 
regenerated plants. A transgenic plant of the present invention containing a desired 
polypeptide is cultivated using methods well known to one skilled in the art.

A transgenic plant of this invention thus has an increased amount of a coding 
region (e.g., a cry gene) that encodes one or more of the Chimeric Cry polypeptides
disclosed herein. A preferred transgenic plant is an independent segregant and can 
transmit that gene and its activity to its progeny. A more preferred transgenic plant is 
homozygous for that gene, and transmits that gene to all of its offspring on sexual mating.
Seed from a transgenic plant may be grown in the field or greenhouse, and resulting 
sexually mature transgenic plants are self-pollinated to generate true breeding plants. The 
progeny from these plants become true breeding lines that are evaluated for, by way of 
example, increased insecticidal capacity against Coleopteran insects, preferably in the
field, under a range of environmental conditions. The inventors contemplate that the
present invention will find particular utility in the creation of transgenic corn, soybeans,
cotton, wheat, oats, barley, other grains, vegetables, fruits, fruit trees, berries, turf grass,
ornamentals, shrubs and trees.

6. Examples 

The following examples are included to demonstrate preferred embodiments of 
the invention. It should be appreciated by those of skill in the art that the techniques 
disclosed in the examples which follow represent techniques discovered by the inventor 
to function well in the practice of the invention, and thus can be considered to constitute
preferred modes for its practice. However, those of skill in the art should, in light of the 
present disclosure, appreciate that many changes can be made in the specific 
embodiments which are disclosed and still obtain a like or similar result without 
departing from the spirit and scope of the invention. 

6.1 Example 1 - Construction of Hybrid B. thuringiensis δ-Endotoxins

The B. thuringiensis shuttle vectors pEG853, pEG854, and pEG857 which are
used in the present invention have been described (Baum et al., 1990). pEG857 contains
the cry1Ac gene cloned into pEG853 as an SphI-BamHI DNA fragment. pEG1064 was
constructed in such a way that the KpnI site within the cry1Ac gene was preserved and the
KpnI site in the pEG857 multiple cloning site (MCS) was eliminated. This was
accomplished by sequentially subjecting pEG857 DNA to limited KpnI digestion so that
only one KpnI site is cut, filling in the KpnI 5' overhang by Klenow fragment of DNA
polymerase I to create blunt DNA ends, and joining the blunt ends of DNA by T4 DNA
ligase. pEG318 contains the cry1F gene (Chambers et al., 1991) cloned into the XhoI site
of pEG854 as an XhoI-SalI DNA fragment. pEG315 contains the cry1C gene from strain
EG6346 (Chambers et al., 1991) cloned into the XhoI-BamHI sites of pEG854 as a SalI-BamHI
DNA fragment.

FIG. 1A shows a schematic representation of the DNA encoding the complete
cry1Ac, cry1Ab, cry1C, and cry1F genes contained on pEG854/pEG1064, pEG20,
pEG315, and pEG318, respectively. Unique restriction sites that were used in
constructing certain hybrid genes are also shown. FIG. 1B shows a schematic
representation of hybrid genes pertaining to the present invention. In some cases standard
PCR™ amplification with mutagenic oligonucleotide primers was used to incorporate
appropriate restriction sites into DNA fragments used for hybrid gene construction.
Certain hybrid gene constructions could not be accomplished by restriction fragment
subcloning. In those instances, PCR™ overlap extension (POE) was used to construct the
desired hybrid gene (Horton et al., 1989). The following oligonucleotide primers
(purchased from Integrated DNA Technologies Inc., Coralville, IA) were used:

Primer A: 5'-GGATAGCACTCATCAAAGGTACC-3' (SEQ ID NO:1)
Primer B: 5'-GAAGATATCCAATTCGAACAGTTTCCC-3' (SEQ ID NO:2)
Primer C: 5'-a^TATTCTGCCTCGAGTGTTdCAGTAAC-3' (SEQ ID NO:3)
Primer D: 5'-CCCGATCGGCCGCATGC-3' (SEQ ID NO:4)
Primer E: 5'-CATTGGAGCTCTCCATG-3' (SEQ ID NO:5)
Primer F: 5'-GCACTACGATGTATCC-3' (SEQ ID NO:6)
Primer G: 5'-CATCGTAGTGCAACTCTTAC-3' (SEQ ID NO:7)
Primer H: 5'-CaUVGAAAATACTAGAGCTCTTGTTAAAAAAGGTGTTCC-3' (SEQ ID NO:8)
Primer I: 5'-ATTTGAGTAATACTATCC-3' (SEQ ID NO:23)
Primer J: 5'-ATTACTCAAATACCATTGG-3' (SEQ ID NO:24)
Primer K: 5'-TCGTTGCTCTGTTCCCG-3' (SEQ ID NO:31)
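
As an illustrative check only, the primers listed above can be scanned for the recognition sequences of the restriction enzymes named in this example. The recognition sequences in the sketch are the standard ones for these enzymes and are not recited in the text; the helper name `sites_in` is hypothetical. Only primers A, B, D, and E are included.

```python
# Standard recognition sequences for enzymes named in this example
# (assumed here; they are not recited in the text). All are palindromic,
# so scanning one strand is sufficient.
SITES = {
    "KpnI": "GGTACC",
    "AsuII": "TTCGAA",
    "XhoI": "CTCGAG",
    "SphI": "GCATGC",
    "SacI": "GAGCTC",
}

# Primer sequences copied from the primer list above.
PRIMERS = {
    "A": "GGATAGCACTCATCAAAGGTACC",
    "B": "GAAGATATCCAATTCGAACAGTTTCCC",
    "D": "CCCGATCGGCCGCATGC",
    "E": "CATTGGAGCTCTCCATG",
}


def sites_in(primer):
    """Return the enzymes whose recognition sequence occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))
```

The sites found this way agree with the digests described below: products made with primer pair A and B are cut with AsuII and KpnI, and products made with primer pair D and E are cut with SphI and SacI.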



The plasmids described in FIG. 1B containing the hybrid δ-endotoxin genes
pertinent to this invention are described below. Isolation or purification of DNA
fragments generated by restriction of plasmid DNA, PCR™ amplification, or POE refers
to the sequential application of agarose-TAE gel electrophoresis and use of the Geneclean
Kit (Bio 101) following the manufacturer's recommendation. pEG1065 was constructed
by PCR™ amplification of the cry1F DNA fragment using primer pair A and B and
pEG318 as the DNA template. The resulting PCR™ product was isolated, cut with AsuII
and KpnI, and used to replace the corresponding AsuII-KpnI DNA fragment in pEG857.
Plasmid pEG1067 was constructed using POE and DNA fragments SauI-KpnI of cry1F
and AsuII-ClaI of cry1Ac that were isolated from pEG318 and pEG857, respectively.
The resulting POE product was PCR™ amplified with primer pair A and B, cut with
AsuII and KpnI, and used to replace the corresponding AsuII-KpnI fragment in pEG857.
pEG1068 was constructed by replacing the SacI-KpnI DNA fragment of cry1Ac
isolated from pEG857 with the corresponding SacI-KpnI DNA fragment isolated from
cry1F (pEG318). pEG1070 was constructed by replacing the SacI-KpnI DNA fragment
isolated from pEG1065 with the corresponding SacI-KpnI DNA fragment isolated from
cry1Ac (pEG857). pEG1072 was constructed by replacing the SacI-KpnI DNA fragment
isolated from pEG1067 with the corresponding SacI-KpnI DNA fragment isolated from
cry1Ac (pEG857). pEG1074, pEG1076, and pEG1077 were constructed by replacing the
SphI-XhoI DNA fragment from pEG1064 with the PCR™ amplified SphI-XhoI DNA
fragment from pEG1065, pEG1067, pEG1068, respectively, using primer pairs G and D.
pEG1089 was constructed by replacing the SphI-SacI DNA fragment of pEG1064 with
the isolated, SphI- and SacI-cut PCR™ product of cry1F that was generated using
primer pair D and E and the template pEG318.

pEG1091 was constructed by replacing the SphI-SacI DNA fragment of pEG1064
with the isolated, SphI- and SacI-cut PCR™ product of cry1C that was generated using
primer pair D and H and the template pEG315.

pEG1088 was constructed by POE using a cry1Ac DNA fragment generated using
primer pair B and F and a cry1C DNA fragment generated using primer pair A and G.
The SacI-KpnI fragment was isolated from the resulting POE product and used to replace
the corresponding SacI-KpnI fragment in pEG1064.
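
Overlap extension relies on the two fragments sharing a short stretch of sequence at the junction, which the inner primers supply. The following sketch, offered only as an illustration, checks how many bases of primers F and G are complementary at their 5' ends. The helper names are hypothetical, and the approach (longest suffix of one sequence matching a prefix of the other) is a simple stand-in for a proper alignment.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def overlap_length(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


# Primer sequences as given for SEQ ID NO:6 and SEQ ID NO:7.
PRIMER_F = "GCACTACGATGTATCC"
PRIMER_G = "CATCGTAGTGCAACTCTTAC"

# Primer F reads on the opposite strand from primer G, so compare the
# reverse complement of F with G.
shared = overlap_length(reverse_complement(PRIMER_F), PRIMER_G)
print(shared, PRIMER_G[:shared])
```

The shared bases are what allow the fragment made with primer pair B and F to anneal to the fragment made with primer pair A and G during the extension step.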

pEG365 was constructed by first replacing the SphI-KpnI DNA fragment from
pEG1065 with the corresponding cry1Ab DNA fragment isolated from pEG20 to give
pEG364. The SacI-KpnI DNA fragment from pEG364 was then replaced with the
corresponding cry1F DNA fragment isolated from pEG318.

pEG1092 was constructed by replacing the KpnI-BamHI DNA fragment from
pEG1088 with the corresponding DNA fragment isolated from pEG315. pEG1092 is
distinct from the cry1Ab/cry1C hybrid δ-endotoxin gene disclosed in Intl. Pat. Appl.
Publ. No. WO 95/06730.

pEG1093 was constructed by replacing the SphI-AsuII DNA fragment from
pEG1068 with the corresponding SphI-AsuII DNA fragment isolated from pEG20.

pEG378 was constructed by POE using a cry1Ac DNA fragment generated using
primer pair B and I using pEG857 as the template and a cry1F DNA fragment generated
using primer pair A and J using pEG318 as the template. The resulting POE product was
cut with AsuII and KpnI and the resulting isolated DNA fragment used to replace the
corresponding AsuII-KpnI DNA fragment in pEG1064.

pEG381 was constructed by replacing the AsuII-XhoI DNA fragment in pEG1064
with the corresponding AsuII-XhoI DNA fragment isolated from the PCR™ amplification
of pEG378 using primer pair C and K.

6.2 Example 2 - Production of the Hybrid Toxins in B. thuringiensis

The plasmids encoding the hybrid toxins described in Example 1 were 
transformed into B. thuringiensis as described (Mettus and Macaluso, 1990). The 
resulting B. thuringiensis strains were grown in 50 ml of C-2 medium until the culture 
was fully sporulated and lysed (approximately 48 hr.). Since crystal formation is a
prerequisite for efficient commercial production of δ-endotoxins in B. thuringiensis,
microscopic analysis was used to identify crystals in the sporulated cultures (Table 4).

Table 4

Crystal Formation by the Hybrid δ-Endotoxins

Strain     Plasmid    Parent δ-Endotoxins          Crystal Formation
EG11060    pEG1065    Cry1Ac + Cry1F               +
EG11062    pEG1067    Cry1Ac + Cry1F               +
EG11063    pEG1068    Cry1Ac + Cry1F               +
EG11065    pEG1070    Cry1Ac + Cry1F               -
EG11067    pEG1072    Cry1Ac + Cry1F               -
EG11071    pEG1074    Cry1Ac + Cry1F               +
EG11073    pEG1076    Cry1Ac + Cry1F               +
EG11074    pEG1077    Cry1Ac + Cry1F               +
EG11087    pEG1088    Cry1Ac + Cry1C               -
EG11088    pEG1089    Cry1F + Cry1Ac               -
EG11090    pEG1091    Cry1C + Cry1Ac               -
EG11091    pEG1092    Cry1Ac + Cry1C               +
EG11092    pEG1093    Cry1Ab + Cry1Ac + Cry1F      +
EG11735    pEG365     Cry1Ab + Cry1F + Cry1Ac      [?]
EG11751    pEG378     Cry1Ac + Cry1F               +
EG11768    pEG381     Cry1Ac + Cry1F               [?]

[?] = entry not legible in the source.

The δ-endotoxin production for some of the B. thuringiensis strains specified in
Table 4 was examined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis
(SDS-PAGE) as described by Baum et al., 1990. Equal volume cultures of each B.
thuringiensis strain were grown in C-2 medium until fully sporulated and lysed. The 
cultures were centrifuged and the spore/crystal pellet was washed twice with equal 
volumes of distilled deionized water. The final pellet was suspended in half the culture 
volume of 0.005% Triton X-100®. An equal volume of each washed culture was 
analyzed by SDS-PAGE as shown in FIG. 2.

The majority of hybrids involving Cry1Ac and Cry1F formed stable crystals in B.
thuringiensis. A notable exception is EG11088 in which the active toxin fragment would
be the reciprocal exchange of EG11063. Two of the three hybrids involving Cry1Ac and
Cry1C, EG11087 and EG11090, failed to produce crystal in B. thuringiensis even though
these reciprocal hybrids mimic the activated toxin fragments of crystal-forming EG11063
and EG11074.

Every strain that was examined by SDS-PAGE produced some level of
δ-endotoxin. As expected, however, those cultures identified as crystal negative
produced very little protein (e.g., lane e: EG11065, lane f: EG11067, lane j: EG11088,
and lane k: EG11090). For reference, a typical yield from a crystal-forming δ-endotoxin
is shown for Cry1Ac (lane a). Several hybrid δ-endotoxins produce comparable levels of
protein including EG11060 (lane b), EG11062 (lane c), EG11063 (lane d;
SEQ ID NO:10), and EG11074 (lane i; SEQ ID NO:12). The data clearly show that
efficient hybrid δ-endotoxin production in B. thuringiensis is unpredictable and varies
depending on the parent δ-endotoxins used to construct the hybrid.

6.3 Example 3 - Proteolytic Processing of the Hybrid δ-Endotoxins

Proteolytic degradation of the protoxin form of the δ-endotoxin to a stable active
toxin occurs once δ-endotoxin crystals are solubilized in the larval midgut. One measure
of the potential activity of δ-endotoxins is the stability of the active δ-endotoxin in a
proteolytic environment. To test the proteolytic sensitivity of the hybrid δ-endotoxins,
solubilized toxin was subjected to trypsin digestion. The δ-endotoxins were purified
from sporulated B. thuringiensis cultures and quantified as described (Chambers et al.,
1991). Exactly 250 µg of each hybrid δ-endotoxin crystal was solubilized in 30 mM
NaHCO3, 10 mM DTT (total volume 0.5 ml). Trypsin was added to the solubilized toxin
at a 1:10 ratio. At appropriate time points 50 µl aliquots were removed to 50 µl Laemmli
buffer, heated to 100°C for 3 min., and frozen in a dry-ice ethanol bath for subsequent
analysis. The trypsin digests of the solubilized toxins were analyzed by SDS-PAGE and
the amount of active δ-endotoxin at each time point was quantified by densitometry. A
graphic representation of the results from these studies is shown in FIG. 3.

The wild-type Cry1Ac is rapidly processed to the active δ-endotoxin fragment
that is stable for the duration of the study. The hybrid δ-endotoxins from EG11063 and
EG11074 are also processed to active δ-endotoxin fragments which are stable for the
duration of the study. The processing of the EG11063 δ-endotoxin occurs at a slower rate
and a higher percentage of this active δ-endotoxin fragment remains at each time point.
Although the hybrid δ-endotoxins from EG11060 and EG11062 are processed to active
δ-endotoxin fragments, these fragments are more susceptible to further cleavage and
degrade at various rates during the course of the study. The 5' exchange points between
cry1Ac and cry1F for the EG11062 and EG11063 δ-endotoxins result in toxins that differ
by only 21 amino acid residues (see FIG. 1). However, the importance of maintaining
Cry1Ac sequences at these positions is evident by the more rapid degradation of the
EG11062 δ-endotoxin. These data demonstrate that different hybrid δ-endotoxins
constructed using the same parental δ-endotoxins can vary significantly in biochemical
characteristics such as proteolytic stability.
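
The quantities in the digest described in this example can be worked through in a short sketch, offered as an illustration only. Two assumptions are made that the text does not state: the 1:10 ratio is read as trypsin:toxin by mass, and the densitometry readings are normalized to the largest reading for a given toxin. The readings themselves are hypothetical placeholders, not data from FIG. 3, and the function names are hypothetical.

```python
def trypsin_mass_ug(toxin_ug, ratio=(1, 10)):
    """Mass of trypsin for a trypsin:toxin ratio, assumed here to be by mass."""
    trypsin_parts, toxin_parts = ratio
    return toxin_ug * trypsin_parts / toxin_parts


def percent_of_peak(band_intensities):
    """Express each densitometry reading as a percentage of the largest reading."""
    peak = max(band_intensities)
    return [100.0 * value / peak for value in band_intensities]


toxin_ug = 250.0   # from the text
volume_ml = 0.5    # from the text
print(toxin_ug / volume_ml)        # toxin concentration, ug per ml
print(trypsin_mass_ug(toxin_ug))   # trypsin added, ug, if the ratio is by mass

# Hypothetical band intensities (arbitrary units) at successive time points.
readings = [80.0, 100.0, 90.0, 75.0]
print(percent_of_peak(readings))
```

A fragment that is stable over the time course would stay near its peak value under this normalization, whereas a fragment that is cleaved further would decline.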

6.4 Example 4 - Bioactivity of the Hybrid δ-Endotoxins

B. thuringiensis cultures expressing the desired δ-endotoxin were grown until
fully sporulated and lysed and washed as described in Example 2. The δ-endotoxin levels
for each culture were quantified by SDS-PAGE as described (Baum et al., 1990). In the
case of bioassay screens, a single appropriate concentration of each washed δ-endotoxin
culture was topically applied to 32 wells containing 1.0 ml artificial diet per well (surface
area of 175 mm²). A single neonate larva was placed in each of the treated wells and the
tray covered by a clear perforated mylar sheet. Larval mortality was scored after 7 days
of feeding and percent mortality expressed as the ratio of the number of dead larvae to the
total number of larvae treated, 32.
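
The scoring rule above reduces to a one-line calculation, sketched here for illustration. The function name and the example count of dead larvae are hypothetical; the group size of 32 is taken from the text.

```python
LARVAE_PER_TREATMENT = 32  # one neonate larva per treated well, from the text


def percent_mortality(dead, total=LARVAE_PER_TREATMENT):
    """Percent mortality: dead larvae divided by larvae treated, times 100."""
    if not 0 <= dead <= total:
        raise ValueError("dead must be between 0 and total")
    return 100.0 * dead / total


print(percent_mortality(24))  # hypothetical count of dead larvae after 7 days
```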

In the case of LC50 determinations (δ-endotoxin concentration giving 50%
mortality), δ-endotoxins were purified from the B. thuringiensis cultures and quantified
as described by Chambers et al. (1991). Eight concentrations of the δ-endotoxins were
prepared by serial dilution in 0.005% Triton X-100® and each concentration was topically
applied to wells containing 1.0 ml of artificial diet. Larval mortality was scored after 7
days of feeding (32 larvae for each δ-endotoxin concentration). In all cases the diluent
served as the control. 
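
The text does not say how the LC50 was computed from the eight-dose mortality data. The sketch below is one simple possibility, offered as an assumption: linear interpolation of mortality against log concentration between the two doses that bracket 50% mortality. It is a stand-in for a formal dose-response fit and does not correct for mortality in the diluent control. The twofold dilution factor, the top dose, its units, and the dead-larva counts are all hypothetical.

```python
import math


def serial_dilution(top, factor, steps):
    """Concentrations from `top` downward, each `factor`-fold more dilute."""
    return [top / factor ** i for i in range(steps)]


def lc50_by_interpolation(concentrations, dead, n_per_dose=32):
    """Estimate LC50 by interpolating mortality against log10(concentration)."""
    points = sorted(zip(concentrations, dead))
    for (c_lo, d_lo), (c_hi, d_hi) in zip(points, points[1:]):
        m_lo, m_hi = d_lo / n_per_dose, d_hi / n_per_dose
        if m_lo <= 0.5 <= m_hi and m_hi > m_lo:
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_lc50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_lc50
    return None  # 50% mortality is not bracketed by the tested doses


doses = serial_dilution(top=1280.0, factor=2, steps=8)  # hypothetical doses
dead = [32, 30, 26, 20, 12, 6, 2, 0]                     # hypothetical counts
print(lc50_by_interpolation(doses, dead))
```

With these placeholder numbers the estimate falls between the 80 and 160 doses, where mortality crosses 50%.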

A comparison of the Cry1A/Cry1F hybrid toxins by bioassay screens is shown in
Table 5. The hybrid δ-endotoxins from strains EG11063 and EG11074 maintain the
activities of the parental Cry1Ac and Cry1F δ-endotoxins. Furthermore, the hybrid
δ-endotoxin from EG11735 maintains the activity of its parental Cry1Ab and Cry1F
δ-endotoxins. The δ-endotoxins produced by strains EG11061, EG11062, EG11071, and
EG11073 have no insecticidal activity on the insect larvae tested despite 1) being
comprised of at least one parental δ-endotoxin that is active against the indicated larvae
and 2) forming stable, well-defined crystals in B. thuringiensis. These results 
demonstrate the unpredictable nature of hybrid toxin constructions. 

For the data in Table 5, all strains were tested as washed sporulated cultures. For
each insect tested, equivalent amounts of δ-endotoxins were used and insecticidal activity
was rated relative to the strain showing the highest percent mortality (++++).

Table 5

Bioassay Screens of Hybrid Cry1A/Cry1F δ-Endotoxins



Strain 


S. frugiperda 


S. exigua


H. virescens


H. zea


O. nubilalis


Cry1Ac




- 


++++ 


M t i. 


+-H- 


Cry1F


++++ 






++ 


++ 


Cry1Ab


++ 


+ 




-H- 




EG11060 






• 


- 


- 


EG11062 












EG11063 


-H-K- . 


++++ 






■H-H- 


EG11071 












EG11073 












EG11074 


++++ 


Mil 


+++ 




-H-H- 


EG11090 




■H-+ 








EG11091 


MM 


M 1 t 






N.D. 


EG11092 


MM 


M 1 i . 


H-H- 


-HH- 


N.D. 


EG11735 


+-H-f ' 


MM 


+++ 


•HH- 


N.D. 


EG11751 


N.D.' 


MM 


N.D. 


M M 


N.D. 



*N.D. = not determined. 



The δ-endotoxins described in FIG. 1 that demonstrated insecticidal activity
in bioassay screens were tested as purified crystals to determine their LC50 (see Table 6).
The δ-endotoxins purified from strains EG11063, EG11074, EG11091, and EG11735 all
show increased armyworm (S. frugiperda and S. exigua) activity compared to any of the
wild-type δ-endotoxins tested. The EG11063 and EG11074 δ-endotoxins would yield
identical active toxin fragments (FIG. 1B), which is evident from their similar LC50 values
on the insects examined. An unexpected result evident from these data is that a hybrid
δ-endotoxin such as that of EG11063, EG11092, EG11074, EG11735, or EG11751 can retain
the activity of its respective parental δ-endotoxins and, against certain insects such as
S. exigua, can have activity far better than either parental δ-endotoxin. This broad range
of insecticidal activity at doses close to or lower than the parental δ-endotoxins, along
with the wild-type level of toxin production (Example 2), makes these proteins particularly
suitable for production in B. thuringiensis. Although the EG11091-derived δ-endotoxin
has better activity against S. frugiperda and S. exigua than its parental δ-endotoxins, it
has lost the H. virescens and H. zea activity attributable to its Cry1Ac parent. This
restricted host range, along with the lower toxin yield observed for the EG11091 δ-endotoxin
(Example 2), makes it less amenable to production in B. thuringiensis.



Table 6

LC50 Values for the Purified Hybrid δ-Endotoxins

Toxin                  S. frugiperda   S. exigua   H. virescens   H. zea    O. nubilalis
Cry1Ac                 >10000          >10000      9              100       23
Cry1Ab                 1435            4740        118            400       17
Cry1C                  >10000          490         >10000         >10000    >10000
Cry1F                  1027            3233        54             800       51
EG11063 (Cry1Ac/1F)    550             114         33             80        T
EG11074 (Cry1Ac/1F)    468             77          25             76        9
EG11091 (Cry1Ac/1C)    21              21          219            >10000    N.D.*

*N.D. = not determined.



In Table 6, the LC50 values are expressed in nanograms of purified δ-endotoxin
per well (175 mm²) and are the composite values for 2 to 6 replications. N.D. = not
determined.
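
The armyworm comparison drawn from Table 6 can be made explicit with a short sketch that divides the best (lowest) parental LC50 by the hybrid LC50. The values are copied from Table 6 and the parent assignments follow the labels given there. The function names are hypothetical, and treating an entry such as ">10000" as a lower bound is an assumption of this example.

```python
# LC50 values (ng per well) copied from Table 6; a leading ">" marks a lower bound.
LC50 = {
    "Cry1Ac":  {"S. frugiperda": ">10000", "S. exigua": ">10000"},
    "Cry1C":   {"S. frugiperda": ">10000", "S. exigua": "490"},
    "Cry1F":   {"S. frugiperda": "1027",   "S. exigua": "3233"},
    "EG11063": {"S. frugiperda": "550",    "S. exigua": "114"},
    "EG11074": {"S. frugiperda": "468",    "S. exigua": "77"},
    "EG11091": {"S. frugiperda": "21",     "S. exigua": "21"},
}

# Parental toxins as labeled in Table 6.
PARENTS = {
    "EG11063": ("Cry1Ac", "Cry1F"),
    "EG11074": ("Cry1Ac", "Cry1F"),
    "EG11091": ("Cry1Ac", "Cry1C"),
}


def parse(entry: str):
    """Return (value, is_lower_bound) for a table entry such as '490' or '>10000'."""
    return float(entry.lstrip(">")), entry.startswith(">")


def fold_vs_best_parent(hybrid: str, insect: str):
    """Ratio of the lowest parental LC50 to the hybrid LC50.

    The second element is True when the lowest parental entry is only a lower
    bound, in which case the ratio is itself a lower bound. A bound that lies
    below an exact parental value would be ambiguous; that case does not occur
    in this data and is not handled.
    """
    hybrid_lc50, _ = parse(LC50[hybrid][insect])
    best, best_is_bound = min(parse(LC50[p][insect]) for p in PARENTS[hybrid])
    return best / hybrid_lc50, best_is_bound


for hybrid in PARENTS:
    for insect in ("S. frugiperda", "S. exigua"):
        fold, is_bound = fold_vs_best_parent(hybrid, insect)
        prefix = ">" if is_bound else ""
        print(f"{hybrid} on {insect}: {prefix}{fold:.1f}-fold lower LC50 than best parent")
```

These ratios are computed from composite LC50 values without confidence limits, so they indicate the direction and rough size of the differences rather than their statistical significance.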

Table 7 describes the DNA surrounding the 5' and 3' exchange points for the
hybrid δ-endotoxins which are pertinent to the present invention. As evident from the SEQ
ID NOs, certain hybrid δ-endotoxins share exchange sites.

To examine the effect of other small changes in the exchange site chosen for
hybrid endotoxin construction, the activities of EG11751 and EG11063 on S. exigua and
H. zea were compared (Table 8). The data clearly show that hybrid δ-endotoxin
improvements can be made by altering the exchange site between the two parental
δ-endotoxins. In this example, the exchange site in the EG11751 δ-endotoxin was moved
75 base pairs 3' compared to the EG11063 δ-endotoxin, which results in improved
insecticidal activity. Although no significant improvement in S. exigua activity is
observed between EG11063 and EG11751, a significant improvement in H. zea activity
of almost 4-fold is observed for EG11751. It is important to note that improvements in
hybrid δ-endotoxin bioactivity by altering exchange sites are unpredictable. In the case of
EG11062, moving the exchange site 63 base pairs 5' of the EG11063 exchange site
abolishes insecticidal activity as shown in Table 7.



Table 8

Bioactivity of EG11063 and EG11751

                           LC50 Values for Washed Sporulated Cultures
B. thuringiensis Strain    S. exigua    H. zea
EG11063                    106          38
EG11751                    90           10



To further examine the effect of changes in the exchange site for hybrid
δ-endotoxins, the hybrid δ-endotoxin encoded by pEG381 was compared to those
encoded by pEG378 and pEG1068. In this example, the 3' exchange site for the
pEG381-encoded hybrid δ-endotoxin was moved 340 base pairs 5' compared to the pEG378
hybrid δ-endotoxin. The data in Table 9 show that this change results in an increase in S.
frugiperda activity compared to the pEG378- and pEG1068-encoded δ-endotoxins while
maintaining the increased activity that was observed for the pEG378-encoded
δ-endotoxin over the pEG1068-encoded δ-endotoxin (see Table 7). This result is
unexpected since the activated toxins resulting from the proteolysis of the encoded
δ-endotoxins from pEG378 and pEG381 should be identical. This example further
demonstrates that exchange sites within the protoxin fragment of δ-endotoxins can have a
profound effect on insecticidal activity.

Table 9

Bioactivity of Toxins Encoded by pEG378, pEG381 and pEG1068

            LC50 Values for Purified Crystals
Plasmid     S. frugiperda    T. ni    H. zea    P. xylostella
pEG378      464              57.7     37.5      3.02
pEG381      274              56.0     36.6      2.03
pEG1068     476              66.7     72.7      3.83



6.5 Example 5 - Activity of the Hybrid Toxins on Additional Pests

The toxins of the present invention were also assayed against additional pests,
including the southwestern corn borer and two pests of soybean. Toxin
proteins were solubilized, added to diet, and bioassayed against target pests. The hybrid
toxins showed very effective control of all three pests.

Table 10

LC50 and EC50 Ranges of Hybrid Toxins on Southwestern Corn Borer¹,²

          EG11063    EG11074    EG11091    EG11751
LC50      20         10-20      10-20      10-20
EC50      0.2-2      0.2-2      0.2-2      0.2-2

¹All values are expressed in µg/ml of diet.

²SWCB data ranges represent LC50 and EC50 ranges (as determined by % >1st
instar), respectively.

Table 11

LC50 Values of Chimeric Crystal Proteins on Soybean Pests¹

Pest                       EG11063    EG11074    EG11091    EG11751    EG11768
Velvetbean caterpillar²    0.9        0.6        0.3        0.1        0.06
Soybean looper²            0.9        0.8        0.6        0.7        0.2

¹All values are expressed in µg/ml of diet.

²Velvetbean caterpillar (Anticarsia gemmatalis) and soybean looper (Pseudoplusia
includens) are both members of the family Noctuidae.

6.6 Example 6 - Amino Acid Sequences of the Novel Crystal Proteins 
6.6.1 Amino Acid Sequence of the EG11063 Crystal Protein (SEQ ID NO:10) 

MetAspAsnAsnProAsnIleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu
GlyGlyGluArgIleGluThrGlyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer

GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 
TrpAspiaaPheLeuValGlnlleGluGlnLeuIleAsnGlnArglleGluGluPheMaArgAsnGlnAla 
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpGluAlaAsp 
ProThrJ^nProAlcOjeiiArgGluGluMetArglleGlnPheAsnAspMetAsnSerMaLe^^ 
20 ileProLeuPheMaValGlnAsnTyrGlnValProLeuLeuSerValTyrValGlnAlaAlaAsnLeu^ 

LeuSerValLeiiArgAspValSerValPheGlyGlnAr^rpGlyPheAspAlaAlaThrlleAsnSerAr^ 
TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspTyrAlaValArgTrpTyrAs 
ArgValTrpGlyProAspSerArgAspTrpValArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal 
LeuAspIleValAlaLeuPheProAsnTyrAspSerArgArgTyrProIleArgThrValSerGlnLeuTh^ 
ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlyIleGlu
ArgSerlleArgSerProHisLeuMetAspIleLeuAsnSerlleThrlleTyrThrAspAlaHisArgGly 
TyrTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro 
LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArglleValAlaGlnLeuGlyGlnGlyValTyrArg 
5 ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsnAsnGlnGlnLeuSerValLeuAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp 
SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis 
ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpThr 
HisArgSerAlaThrProThrAsnThrlleAspProGluArglleThrGlnlleProLeuValLysAlaHis 

10 ThrLeuGlnSerGlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeuArgArgThrSer 
GlyGly ProPheAl aTyrThr I leValAsnl leAsnGlyGlnLeuProGlnArgivi^Ar^ 
TyrAlaSerThrThrAsnLeuArglleTyrValThrValAlaGlyGluArgllePheAlaGlyGln^^ 
LysThrMetJVspThrtSlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnThrAlaPheTto 
PheProMetSerGlnSerSerPheThrValGlyAlaAspThrPheSerSerGlyAsnGluValTyrlleAsp 

15 ArgPheGlxiLeuIleProValThrAlaThrPheGluAlaGluTyrAspI-euGluArgAlaGlnLysAlaVal 
AsnAlaLeuPheThrSerlleAsnGlnXleGlylleLysThrAspValThrAspTyrHisIleAspGlnVal 
SerAsnLeuValAspCysZieuSerAspGluPheCYsLexiAspGluLysArgGltiLeuSerGluLysValL^ 
HisAlaLysArgLeuSerAspGluArgAsiUjeuLeuGlnAspProAsnPheLysGlylleAsnArgGlxiL^^ 
J\spArgGlyTrpArgGlySerThrAspIleThrIleGlnArgGlyAspAspValPheLysGluAsnTyrVal 

20 ThrLeuProGlyThrPheAspGluCysTyrProThrTyrLeuTyrGlnLysIleAspGlxiSerLysLeuLys 
AlaPheThrArgTyrGlnLeiiArgGlyTyrlleGluAspSerGlnAspLeuGluIleTyrL^ 
AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLeuSerAlaGlnSerProIle 
GlyLysCVsGlyGluProAsnAr^CysMaProHisLeuGluTrpAsnProAspLeuAspCysSerCys 
AspGlyGluLysCysAlaHisHisSerHisHisPheSerLexiAspIleAspValGlyCysThrAspLeuAsn 

25 GlixAspLeuGlyValTrpValllePheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu 
PheLeuGluGlxja^ysProLeuValGlyGluAleOieuAlaArgValLysArgAlaGluLysLysTrpArgAsp 
LysArgGluLysLeuGluTrpGluThrAsnlleValTyrLysGliiAlaLysGliiSerValAspAlaLeuPhe 
ValAsnSerGlnTyrAspGlnLeuGlnMaAspThrAsnlleAlaMetlleHisAlaAlaAspLysArgVal 
HisSerlleArgGluAlaTyrLeuProGluLeuSerVallleProGlyValAsnAlaAlallePheGluGlu 

30 LeuGluGlyAxgllePheThrAlaPheSerLeuTyrAspAlaArgAsnVallleLysAsxiGlyAspPheAsn 
AsnGlyLeuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsxiAsnGInArgSerValLeu 
ValValProGluTrpGl\aAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrIleLeuArg 
ValThrAlaTyrLysGluGlyTyxGlyGluGlyCysValThrlleHisGluIleGltiAsnAsnThrAspGlu 
LeuLysPheSerAsnCysValGluGluGluIleTyrProAsnAsnThrValThrCysAsnAspTyrTh^ 

35 AsnGlnGluGluTyrGlyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGluAlaProSerValProAla 
AspTyrAlaSerValTyrGluGluLysSerTyrThrAspGlyArgArgGluAsnProCysGluPheAsnArg 
GlyTyrArgAspTyrThrProLeuProValGlyTyrValThrLysGluLeuGluTyrPheProGluThrAsp
LysValTrpIleGluIleGlyGluThrGluGlyThrPhelleValAspSerValGluLeuLeuLeuMeCGlu 
Glu 
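
Because the sequences in this example are written as unbroken runs of three-letter residue codes, a small parser is a convenient way to check a transcription. The sketch below is an illustration only: the function names are hypothetical, and the sample string is the first line of the sequence above (24 residues). It splits a run into residues, converts them to one-letter codes, and raises an error at the first token that is not a standard residue code, which is how character-recognition damage such as "lle" for "Ile" would show up.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def split_residues(seq: str) -> list:
    """Split a run such as 'MetAspAsn' into ['Met', 'Asp', 'Asn'].

    Whitespace is ignored. A ValueError is raised if any characters are left
    over, i.e. the text is not a clean run of capitalized three-letter tokens.
    """
    compact = "".join(seq.split())
    residues = re.findall(r"[A-Z][a-z]{2}", compact)
    if "".join(residues) != compact:
        raise ValueError("sequence contains characters outside three-letter codes")
    return residues


def to_one_letter(seq: str) -> str:
    """Convert a three-letter-code run to a one-letter string."""
    out = []
    for position, residue in enumerate(split_residues(seq), start=1):
        if residue not in THREE_TO_ONE:
            raise ValueError(f"unknown residue code {residue!r} at position {position}")
        out.append(THREE_TO_ONE[residue])
    return "".join(out)


first_line = "MetAspAsnAsnProAsnIleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu"
print(to_one_letter(first_line))
```

A one-letter string is also easier to compare across the hybrid proteins, for example to locate where two sequences first diverge.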

6.6.2 Amino Acid Sequence of the EG11074 Crystal Protein (SEQ ID NO:12)

MetAspAsnAsxiProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 
GlyGlyGluArglleGluThrClyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer 
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 
TrpAspAlaPheLeuValGlnlleGluGlnLeuIleAsnGlnArglleGluGluPheAlaArgAsnGlnAla 

10 ileSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpGluAlaAsp 
ProThrAsnProAlaLeuArgGluGlxiMetArglleGlnPheAsnAspMetAsnSerAlaLeuThrt 
IleProLeuPheAlaValGlnAsnTyrGlnValProLexiLeuSerValTyrValGlnAlaAlaAsnLe^ 
LeuSerValLeuArgAspValSerValPheGlyGlnArgTrpGlyPheAspAlaAlaThrlleAsnSerArg 
TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspTyrAlaValArgTrpTyrAsnThrGlyLeuGlu 

15 AxgValTrpGlyProAspSerArgAspTrpValArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal 
LeuAspIleValAlaLeuPheProAsnTyrAspSerAr^ArgTyrProIleArgThrValSerGlnLeuThr 
ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlylleGlu 
ArgSerlleArgSerProHisLeuMetAspIleLeuAsnSerlleThrHeTyrThrAspAlaHisArgGly 
TyrTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro 

20 i^uTyrGlyThrMetGlyAsnAlaMaProGlnGlnArglleValAlaGlnLeuGlyGlnGlyValTyrAr 
ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsnAsnGlnGlnLeuSerValLeuAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsxiLeuProSerAlaValTyrArgLysSerGlyThrValAsp 
SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis 
ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpThr 

25 HisArgSerAlaThrProThrAsnThrlleAspProGliiArglleThrGlnlleProLeuValLysAlaHis 
ThrLeuGlnSerGlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeuArgArgThrSer 
GlyGlyProPheAlaTyrThrlleValAsnlleAsnGlyGlnLeuProGlnArgTyrArgAlaArglleArg 
TyrAlaSerThrThrAsxiLeuArglleTyrValThrValAlaGlyGliiArgllePheAlaGlyGlnPh^ 
LysThrMetAspThrGlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnThrAlaPheThr 

30 PheProMetSerGlnSerSerPheThrValGlyAlaAspThrPheSerSerGlyAsnGluValTyrlleAsp 
ArgPheGluLeuIleProValThrAlaThrLeuGliJjaaGluTyrAsnLeuGluArgAlaGlnLysAlaVal 
AsnAlaLeuPheThrSerThrAsnGlnLeuGlyLeuLysThrAsnValThr AspTyrHi s I leAsp^ 
SerAsnLeuValThrTyrLeuSerAspGluPheCysLeuAspGluLysArgGluLeuSerGluLysValLys 
HisAlaLysArgLeuSerAspGluArgAsnLeuLeuGlnAspSerAsnPheLysAspIleAsnArgGlnPro 

35 GluArgGlyTrpGlyGlySerThrGlylleTlirlleGlnGlyGlyAspAspValPheLysGliiAsnT^ 
Thrl^uSerGlyThrPheAspGluCysTyrProThrTyrLeuTyrGlnLysIleAspGluSerLys 
AlaPheThrArgTyrGlnLeuArgGlyTyrIleGluAspSerGlnAspLeuGluIleTyrLeuIleArgTyr

AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLeuSerAlaGlnSerProIle 

GlyLysCysGlyGluProAsnArgCysAlaProHisLeuGluTrpAsnProAspLexiAspCysSerCysArg 

AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIleAspValGlyCysThrAspLeuAsn 

GluAspLeuGiyValTrpVal I lePheLys I leLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu 

PheLeuGluGluLysProLeuValGlyGluAlaLeuAlaArgValLysArgAlaGluLysLysTrpArgAsp 

LysArgGluLysLeuGluTrpGluThrAsnlleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe 

ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnlleAlaMetlleHisAlaMaAspLysi^^ 

HisSerlleArgGluAlaTyrLeuProGluLeuSerVallleProGlyValAsnAlaAlallePheGluGlu 

LeuGluGlyArgllePheThrAlaPheSerLeuTyrAspAlaArgAsnVallleLysAsnGlyAspPheAsn 

AsnGlyteuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsnAsnGlnAigSerVal^ 

ValValProGluTrpGluAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrlleLeuArg 

ValThrAlaTyrLysGluGlyTyrGlyGluGlyCysValThrlleHisGluIleGluAsnAsnThrAspG^ 

LeuLysPheSerAsnCysValGluGluGluIleTyrProAsnAsnThrValThrCysAsnAspTyrt 

AsnGlnGluGluTyrGlyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGlti^ 

AspTyrAlaSerValTyrGluGluLysSerTyrThrAspGlyArgArgGluAsnProCysGluPheA^ 

GlyTyrArgAspTyrThrProLeuProValGlyTyrValThrLysGliU-euGluTyrPhePro^ 

LysValTrplleGluIleGlyGluThrGluGlyThrPhelleValAspSerValGluLeuLeuLeuMetGlu 

Glu 

6.6.3 Amino Acid Sequence of the EG11735 Crystal Protein (SEQ ID NO:14)

MetAspAsnAsnProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 

GlyGlyGluArglleGluThrGlyTyrThrProlieAspIleSerLeuSerLeuThrGlnPheLeiiLeuSer 

GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 

TrpAspAlaPheLeuValGinlleGluGlnLeuIleAsnGlnArglleGluGluPheAlaArgAsnGlnA^ 

IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpG 

ProThrAsnProAlal-euArgGluGluMetArglleGlnPheAsnAspMetAsnSerAlaLeu^ 

IleProLeuPheAlaValGlxiAsnTyrGlnValProLeuLeuSerValTyrValGlnAlaAl 

LeuSei^ValLeiiArgAspValSerValPheGlyGlnArgTrpGlyPheAspAlaAlaTte 

TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspHisAlaValArgTrpTyrAsnT^ 

ArgValTrpGlyProAspSerArgAspTiT>IleArgTyrAsnGlnPheArgArgGl\iIieuThrLeuThr^^ 

LexiAspIleValSerLeuPheProAsnTyrAspSerArgThrTyrProIleAr^ 

ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlylleGlu 
GlySerlleArgSerProHisLeuMetAspIleLeuAsnSerlleThrlleTyrThrAspAlaHisArgGly 
GluTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro 
LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArgIleValAlaGlnLeuGlyGlnGlyValTyrArg
ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsnAsnGlnGlnLeuSerValLeuAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp 
SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArglieuSerHis 
5 valSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpThr 
HisArgSerAlaThrProThrAsnThrlleAspProGluArglleThrGlnlleProLeuValLysAlaHis 
ThrLeuGlnSerGlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeiiArgArgThrSer 
GlyGlyProPheAlaTyrThrlleValAsnlleAsnGlyGlnLeuProGlnArgTyrArgAlaArglle 
TyrAlaSerThrThrAsnLetJUrglleTyrValThrValAlaGlyGluArgllePheAlaGlyGlnPheAsn 

10 LysThrMetAspThrGlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnThrAlaPheThr 
PheProMetSerGlnSerSerPheThrValGlyAlaAspThrPheSetSerGlyAsnGluValTyrlleAsp 
ArgPheGliiLeuIleProValThrAlaThrPheGliiAlaGluTyrAspLeuGlxiArgAlaGlnLysAlaVal 
AsnAlaLeuPheThrSerlleAsnGlnlleGlylleLysThrAspValThrAspTyrHisIleAspGlnVal 
SerAsnLeuValAspCysLeuSerAspGluPheCysLeuAspGluLysArgGluLeuSerGluLysValLys 

15 HisAlaLysArgLeuSerAspGlxjArgAsnLeuLeuGlnAspProAsnPheLysGlylleAsnArgGlnLeu 
AspArgGlyTrpArgGlySerThrAspIleThrlleGlnArgGlyAspAspValPheLysGluAsnTyrVal 
ThrI*euProGlyThrPheAspGluCysTyrProThrTyrLeuTyrGlnLysIleAspGluSerLysI^ 
AlaPheThrArgTyrGlrxLeuArgGlyTyrIleGluAspSerGlnAspLeuGluIleTyrLeuIleArgT>^ 
AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLeuSerAlaGlnSerProIle 

20 GlyLysCysGlyGluProAsnArgCysAlaProHisLeuGluTrpAsnProAspLeixAspCysSerCysArg 
AspGlyGl\Jd.ysCysAlaHisHisSerHisHisPheSerLeuAspIleAspValGlyCysThrAspLeiiAsn 
GluAspLeuGlyValTrpVal I lePheLys I leLysThrGlnAspGlyHi sAlaArgLeuGlyAsnLeuGlu 
PheLeuGluGliiLysProLeuValGlyGluAlaLeuAlaArgValLysArgAlaGluLysLysTrpArgAsp 
LysArgGlxiLysLeuGluTrpGluThrAsnlleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe 

25 valAsxiSerGlnTyrAspGlnLeuGlnAlaAspThrAsnlleAlaMetlleHisAlaAlaAspLysJ^ 

HisSerlleArgGluAlaTyrLeuProGluLeuSerVallleProGlyValAsnAlaAlallePheGluGlu 
LeuGluGlyArgllePheThrAlaPheSerLeuTyrAspAlaArgAsnVallleLysAsnGlyAspPheAsn 
AsnGlyLeuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsnAsnGlnArgSerValLeu 
ValValProGluTrpGluAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrlleLeuArg 

30 valThrAlaTyrLysGluGlyTyrGlyGluGlyCysValThrlleHisGluIleGlxiAsnAsnThrAspGlu 
LeuLysPheSerAsnCVsValGluGluGluIleTyrProAsiiAsnThrValThrCysAsnAspTyrThrVa^ 
AsnGlnGluGluTyrGlyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGltiAlaProSerValProAla 
AspTyrAlaSerValTyrGluGluLysSerTyrThrAspGlyArgArgGluAsnProCysGluPhe^ 
GlyTyrArgAspTyrThrProLeuProValGlyTyrValThrLysGliiLeuGluTyrPheProGluT^ 

35 LysValTrpIleGluIleGlyGluThirtSluGlyThrPhelleValAspSerValGluLeuIiexUieuMetGlu 

Glu 
6.6.4 Amino Acid Sequence of the EG11092 Crystal Protein (SEQ ID NO:26)

MetAspAsnAsnProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 
GlyGlyGlviArglleGluThrGlyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuJLeuSer 
5 GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 
TrpAspAlaPheLeuValGlnlleGluGlnLeuIleAsnGlnArglleGluGluPheAlaArgAsnGlnAla 
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpGluAlaAsp 
ProThrAsnProAlaLeuArgGluGluMetArglleGInPheAsnAspMetJ^nSerAlaLeuThrThrAla 
IleProLeuPheAlaValGlnAsnTyrGlnValProLeiilieuSerValTyrValGlnAlaAlaAsiiLexa^ 

10 LeuSerValLeuArgAspValSerValPheGlyGlnArgTiTpGlyPheAspAlaAlaTh^ 

TyrAsnAspLeuThrArgLeuIleGlyAsnTyxThrAspHisAlaValArgTrpTyrAsnThrGl^ 
ArgValTrpGlyProAspSerArgAspTrpIleArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal 
LeuAspIleValSerLeiiPheProAsnTyrAspSer/UrgThrTyrProIleArgThrValSerGlnLeuThr 
ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlylleGlu 

15 ArgSerlleArgSerProHisLeuMetAspIleLeiiAsnSerlleThrlleTyrThrAspAlaHisi^ 

TyrTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro 
LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArglleValAlaGlnLeuGlyGlnGlyValTyrArg' 
ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsnAsnGlnGlnLeuSerValLeiiAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp 

20 SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis 
ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpThr 
HisArgSerAlaThrProThrAsnThrlleAspProGluArglleThrGlnlleProIieuValLysAlaHis 
ThrLeuGlnSerGlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeuArgAr^^ 
GlyGlyProPheAlaTyrThrlleValAsnlleAsnGlyGlnLeuProGlnArgTyrArgAlaArgl^ 

25 TyrAlaSerThrThrAsnLeioArglleTyrValThrValAlaGlyGluArgllePheAl^^ 

LysThrMetAspThrGlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnT^ 
PheProMetSerGlnSerSerPheThrValGlyAlaAspThrPheSerSerGlyAsnGluValTyrlleAsp 
ArgPheGluLeuIleProValThrAlaThrPheGlxaAlaGluTyrAspLeuGluArgAlaGlnLysA^ 
AsnAlaLeuPheThrSerlleAsnGlnlleGlylleLysThrAspValThrAspTyrHisIleAspGlnVal 

30 SerAsnLeuValAspCysLeuSerAspGluPheCysLeuAspGluLysArgGluLeiiSerGl\iLysValLy 
HisAlaLysArgLeuSerAspGluArgAsnLeiilieuGlnAspProAsnPheLysGlylleAsnArgGlnLeu 
AspArgGlyTrpArgGlySerThrAspIleThrlleGlnArgGlyAspAspValPheLysGlxiAsnTy^ 
ThrLeuPrcKSlyThrPheAspGluCysTyrProThrTyrLeuTyrGlnLysIleAspGluSer^ 
AlaPheThrArgTyrGlnLeuArgGlyTyrlleGluAspSerGlnAspLeuGluIleTyrLeuIleA^ 

35 AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLexiSerAlaGlnSerProIle 
GlyLysCysGlyGluProAsnArgCysAlaProHisLeuGluTrpAsnProAspLeuAspCysSerCysArg

AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIleAspValGlyCysThrAspLeuAsn 

GluAspLeuGlyValTrpValllePheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu 

PheLeuGluGluLysProLeuValGlyGluAlaLeuAlaArgValLysArgAlaGluLysLysTrpArgAsp 

LysArgGliiLysLeuGluTrpGluThrAsnlleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe 

ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnlleAlaMetlleHisAlaAlaAspLysArgVal 

HisSerlleArgGluAlaTyrLeuProGluLeuSerVallleProGlyValAsnAlaAlallePheGluGlu 

LeuGluGlyArgllePheThrAlaPheSerLeuTyrAspAlaArgAsnVallleLysAsnGlyAspPheAsn 

AsnGlyLeuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsnAsnGlnArgSerValLeu 

ValValProGluTrpGluAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrlleLeuArg 

ValThrAlaTyrLysGluGlyTyrGlyGluGlyCysValThrlleHisGluIleGluAsnAsnThrAspGlu 

LexiLysPheSerAsnCysValGluGluGluIleTyrProAsnAsnThrValThrCysAsnAspTy^ 

AsnGlnGluGluTyrGlyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGluAlaProSerVal 

AspTyrAlaSerValTyrGluGltiLysSerTyrThrAspGlyArgArgGluAsnProCy^ 

GlyTyrArgAspTyrThrProLeuProValGlyTyrValThrLysGluLeuGluTyrPheProGluTh 

LysValTrpIleGluIleGlyGluThrGluGlyThrPhelleValAspSerValGluLeuLeuLeuMetGlu 

Glu

6.6.5 Amino Acid Sequence of the EG11751 Crystal Protein (SEQ ID NO:28)

MetAspAsnAsnProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 

GlyGlyGluArglleGluThrGlyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer 

GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlylLePheGlyProSerGln 

TrpAspAlaPheLeuValGlnlleGluGlnlieuIleAsnGlnArglleGluGluPheAlaArgAsnGlnAla 

IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpGluAlaAsp 

ProThrAsnProAlaLeiiArgGluGluMetArglleGlxiPheAsnAspMetAsnSer^ 

IleProLeuPheAlaValGlnAsnTyrGlnValProLeuLeiiSerValTyrValGlnAlaAlaAsnL^ 

LeuSerValLeiJArgAspValSerValPheGlyGlnArgTrpGlyPheAspMaAlaThrlleAsnSerA^ 

TyrAsnaspLeuThrArgLeuIleGlyAsnTyrThrAspTyrMaValArgTr^ 

ArgValTrpGlyProAspSerArgAspTrpValArgTyrAsnGlnPheArgArgGluLeuThrL^^ 

LexiAspIleValAlaLeuPheProAsiiTyrAspSerArgArgTyrProIleArgThrValSerGlnLeuThr 

ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlylleGlu 

ArgSerlleArgSerProHisLeuMetAspIleLeuAsnSerlleThrlleTyrThrAspAlaHisArgGly 

TyrTyrTyrTrpSerGlyHisGlnlleMetJaaSerProValGlyPheSerGlyProGluPhe^^^ 

I^uTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArglLeValAlaGlnLeuGlyGlnGlyValTy^ 

ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsnAsnGlnGlnLeuSerValLeuAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp
SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis 
ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpIle 
HisArgSerAlaGluPheAsnAsnllelLeAlaSerAspSerlleThrGlnlleProLeuValLysAlaHis 
5 ThrLeuGlnSerGlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeuArgArgThrSer 
GlyGlyProPheAlaTyrThrlleValAsnlleAsnGlyGlxUieuProGlnArgTyrArgAla;^ 
TyrAlaSerThrThrAsnLeuArglleTyrValThrValAlaGlyGluArgllePheAlaGlyGlnPheA^ 
LysThrMetAspThrGlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnThrAl^ 
PheProMetSerGlnSerSerPheThrValGlyAlaAspThrPheSerSerGlyAsnGluValTyrlleAsp 
10 ArgPheGluLeuIleProValThrAlaThrPheGluAlaGluTyrAspLeuGluArgAlaGlnLysAlaVal 
AsnAlaLeuPheThrSerlleAsnGliilleGlylleLysTteTVspValThrAspTyrHi^ 
SerAsnLeuValAspCysLeuSerAspGluPheCysLeuAspGluLysArgGluLeuSerGluLysValLys 
HisAlaLysArgLeuSerAspGluArgAsnLeiJjJeuGlnAspProAsiiPheLysGlylleAsnArgGlnLe 
AspArgGlyTrpArgGlySerThrAspIleThrlleGlnArgGlyAspAspValPheLysGluAsnTyrVal 
15 ThrLeuProGlyThrPheAspGluCVsTyrProThrTyrLeuTyrGlnLysIle^ 

AlaPheThrArgiyrGlnLeuArgGlyTyrlleGluAspSerGlxiAspLeuGluIleTyrLeuIleArgT^ 
AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLeuSerAlaGlnSerProIle 
GlyLysCysGlyGluProAsnArgCysAlaProHisLeuGluTrpAsnProAspLe\iAspCysSerC^ 
AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIleAspValGlyCysThrAspLeuAsn 
20 GltiAspLeuGlyValTrpValliePheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGl 
PheLeuGluGluLysProLeuValGlyGliiAlaLetiAlaArgValLysArgAlaGluLysLysTrpArgAsp 
LysArgGliiLysLeuGluTrpGluThrAsnlleValTyrLysGlxiAlaLysGluSerValAspAlaLeuPhe 
ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnlleAlaMetlleHisAlaAlaAspLysArgVal 
HisSerlleArgGliaAlaTyrLeixProGluLeuSerVallleProGlyValAsnAlaAlallePhe^ 
25 LeuGluGlyArgllePheThrAlaPheSerLeuTyrAspAlaArgAsnVallleLysAsnGlyAspPheAsn 
AsnGlyLeuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsnAsnGlnAifgSer^^ 
ValValProGluTrpGluAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrlleLeiiAr^ 
ValThrAlaTyrLysGluGlyTyrGlyGluGlyCysValThrlleHisGluIleGluAsnAsnThrAspGlu 
LeuLysPheSerAsnCysValGluGluGluIleTyrProAsnAsnThrValThrCysAsnAspT^ 
30 AsnGlnGluGluTyrGlyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGluAlaProSerValProAla 
AspTyrAlaSerValTyrGluGluLysSerTyrThrAspGlyArgArgGliiAsnProCVsGluPheAsr^ 
GlyTyrArgAspTyrThrProI^uProValGlyTyrValThrLysGluLeuGluTyrPheProGluT^^ 
LysValTrpIleGluIleGlyGluThrGluGlyThrPhelleValAspSerValGluLeuLeuLeuMetGlu 

Glu 

6.6.6 Amino Acid Sequence of the EG11091 Crystal Protein (SEQ ID NO:30)

MetAspAsnAsnProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 
GlyGlyGluArglleGluThrGlyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer 
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 
5 TrpAspAlaPheLeuValGlnlleGluGlnLeuIleAsnGlnArglleGluGluPheAlaArgAsnGlnAla 
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGluSerPheArgGluTrpGluAlaAsp 
ProThrAsnProAlaLexiArgGluGluMetArglleGlnPheAsnAspMetAsnSerAlaLeuThrThrAla 
I leProLeuPheAlaValGlnAsnTyrGlnVal ProLeuLeuSerValTyrValGlnAlaAlaAsnLeuHi s 
LeuSerValLeuArgAspValSerValPheGlyGlnArgTrpGlyPheAspAlaAlaThrlleAsnSerArg 

1 0 TyrAsnAspI^uThrArgLeuIleGlyAsnTyxThrAspT^ 

ArgValTrpGlyProAspSerArgAspTrpValArgTyrAsnGlnPheArgArgGliiLeuThrLeuThrVal 
LeuAspIleValAlaLeuPheProAsnTyrAspSerArgArgTyrProIleArgThrValSerGlnLeuThr 
ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnQlylleGlu 
ArgSerlleArgSerProHisLexiMetAspIleLeuAsnSerlleThrlleTyrThrAspAlaHisArgGly 

15 TyrTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPh^ 

LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArglleValAlaGlnLeuGlyGlnGlyValTyrArg 
ThrLeuSerSerThrLeuTyrArgArgProPheAsnlleGlylleAsiiAsnGlnGlnlieuSerValLeuAsp 
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp 
SerLexxAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis 

20 ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpIle 
HisArgSerAlaThrLeuThrAsnThrlleAspProGluArglleAsnGlnlleProLeuValLysGlyPhe 
ArgValTrpGlyGlyThrSerVallleThrGlyProGlyPheThrGlyGlyAspIleLexiArgArgAsnThr 
PheGlyAspPheValSerLeuGlnValAsnlleAsnSerProIleThrGlnArgTyrArgLeuArgPheArg 
TyrAlaSerSerArgAspAlaArgVallleValLeuThrGlyAlaAlaSerThrGlyValGlyGlyGlnVal 

25 serValAsnMetProLeuGlnLysThrMetGluIleGlyGluAsnLeuThrSerArgThrPhe^ 

AspPheSerAsnProPheSerPheArgAlaAsnProAspIleXleGlylleSerGluGlnProLeuPheGly 
AlaGlySerlleSerSerQlyGluLeuTyrlleAspLysIleGluIlelleLeuAlaAspAlaThrPheGlu 
AlaGluSerAspLeuGluAz^AlaGlziLysAlaValAsnAlaLeuPheThrSerSerAsnGlnXleGlyLeu 
LysThrAspValThrAspTyrHisIleAspGlnValSerAsnLeuValAspCysLeuSerAspGluPheCys 

30 LexAspGliiLysArgGluLeiiSerGluLysValLysHisAlaLysArgLeuSerAspGluArgAsriLeuLeu 
GlnAspProAsnPheArgGlylleAsnArgGlnProAspArgGlyTrpArgGlySerThrAspIleThrlle 
GlnGlyGlyAspAspValPheLysGluAsnTyrValThrLeuProGlyThrValAspGluCysTyrProThr 
lyrLeuTyrGlnLysIleAspGluSerLysLeiiLysAlaTyrThrArgTyrGliiI^xiArgGly 
AspSerGlnAspLeuGluIleTyrLeuIleArgTyrAsnAlaLysHisGluIleValAsnValProGlyThr 

35 GlySerLeuTrpProLeuSerAlaGlnSerProIleGlyLysCysGlyGluProAsnArgCysAlaProHis 




LeuGluTrpAsnProAspLeuAspCysSerCysAirgAspGlyGluLysCysAlaHisHisSerHisHisPhe 
ThrLeuAspIleAspValGlyCysThrAspLeiJtAsnGluAspLeuGlyValTrpValllePheLysIleLys 
ThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGIuPheljeuGluGluLysProLeuLeuGlyGluAlaLeu 
AlaArgValLysArgAlaGluLysLysTrpArgAspLysArgGluLysLeuGlnLeuGluThrAsnlleVal 
5 TyrLysGluAlaLysGluSerValAspAlaLeuPheValAsnSerGlnTyrAspArgLeuGlnValAspThr 
AsnlleAlaMetlleHisAlaAlaAspLysArgValHisArglleArgGluAlaTyrl^uProGluLeuSer 
VallleProGlyValAsnAlaAlallePheGluGluLeuGluGlyArgllePheThrAlaTyrSerLeuTyr 
AspAlaArgAsnVallleLysAsnGlyAspPheAsnAsnGlyLeuLeuCysTrpAsnValLysGlyHisVal 
AspValGluGluGlnAsnAsnHisArgSerValLeuVallleProGluTrpGluAlaGluValSerGlnGlu 

10 VaXArgValCysProGlyArgGlyTyrlleLeuArgValThrAlaTyrLysGluGlyTyrGlyGluGlyC^ 
ValThrlleHisGluIleGluAspAsnThrAspGlxxLeuLysPheSerAsnCysValGluGluGluVall^ 
ProAsnAsnThrValThrCVsAsnAsnTyrThrGlyThrGlnGluGluTyrGluGl 
A^nGlnGlyTyrAspGluAlaTyrGlyAsnAsnProSerValProAlaAspTyrAlaSerValTy^ 
LysSerTyrThrAspGlyArgArgGluAsnProCysGluSerAsnArgGlyTyrGlyAspTyrThrProLeu 

15 ProAlaGlyTyrValThrLysAspLeuGluTyrPheProGluThrAspLysValTrpIleGluIleGlyGlu 
ThrGluGlyThr Phe I leValAspSerValGluLeiiLeuLeuMe tGluGlu 

6.6.7 Amino Acid Sequence of the EG11768 Crystal Protein (SEQ ID NO:34)

MetAspAsnAsnProAsnlleAsnGluCysIleProTyrAsnCysLeuSerAsnProGluValGluValLeu 

20 GlyGlyGluArgIleGluThrGlyTyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuI*euSer 
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIlelleTrpGlyllePheGlyProSerGln 
TrpAspAlaPheLeuValGlnlleGluGlnLeuIleAsnGlnArglleGluGluPheAlsArgAsnGlnAla 
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnlleTyrAlaGlxxSerPheArgGluTrpGluAlaAsp 
ProThrAsxiProAlaLeiiArgGluGluMetJVrgIleGlnPheAsnAspMetJ^xiSerAlaI#eu^ 

25 iieProLeuPheAlaValGInAsnTyrGlnValProIieuLeuSerValTyrValGlziAlaAlaAsxiLei^ 

LeuSerValLeuArgAspValSerValPheGlyGlxiArgTrpGlyPheAspAIaAlaThrlleAsnSerArg 
TyrAsnAspI-euThrArgLeuIleGlyAsnTyrThrAspTyrAlaValArgTrpT^ 
ArgValTrpGlyProAspSerArgAspTrpValArgTyrAsnGlnPheArgArgGlxiLeuThrLeuTl^ 
LeuAspIleValAlaLeuPheProAsnTyrAspSerArgArgTyrProIleArgThrValSerGlnLeuTh 

30 ArgGluIleTyrThrAsnProValLeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlylleGlu 
ArgSerlleArgSerProHisLeuMetAspIleLeuAsnSerlleThrlleTyrThrAspAlaHisArgGly 
TyrTyrTyrTrpSerGlyHisGlnlleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro 
LeuTyrGlyThrMetGlyAsnAlaMaProGlnGlnArglleValAlaGlnLeuGlyGlnGlyValTyr;^ 
ThrLeuSerSerThrLeuTyr ArgArgProPheAsnl leGly I leAsoAsnGlnGlnlieuSerVal LeuAsp 

35 GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaValTyrArgLysSerGlyThrValAsp 
SerLeuAspGluIleProProGlnAsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis

ValSerMetPheArgSerGlyPheSerAsnSerSerValSerllelleArgAlaProMetPheSerTrpIle 

HisArgSerAlaGluPheAsnAsnllelleAlaSerAspSerlleThrGlnlleProLeuValLysAlaHis 

ThrLeuGlnSertSlyThrThrValValArgGlyProGlyPheThrGlyGlyAspIleLeuArBArgThr^ 

GlyGlyProPheAlaTyrThrlleValAsnlleAsnGlyGlnLeuProGlnArgTyrArgAlaArglleArg 

TyrAlaSerThrThrAsnLeuArglleTyrValThrValAlaGlyGluArgllePheAlaGlyGlnPheAsn 

LysThrMetAspThrGlyAspProLeuThrPheGlnSerPheSerTyrAlaThrlleAsnThrAlaPheThr 

PheProMetSerGlxiSerSerPheThrValGlyAlaAspThrPheSerSerGlyAsnGluValTyrlle^ 

ArgPheGliiLeuIleProValThrAlaThrLeuGluAlaGluTyrAsnLeuGluArgAlaGlnLysAlaVal 

AsnAlaLeuPheThrSerThrAsnGlnLeuGlyLeuLysThrAsnValThrAspTyrHisIleAspGlnVal 

SerAsnteuValThrTyrLeuSerAspGluPheCysLeuAspGluLysA^ 

HisAlaLysArgLeuSerAspGliiArgAsriLexiLeuGlnAspSerAsnPheLysAspIleAsnArgGlnPro 

GluArgGlyTrpGlyGlySerThrGlylleThrlleGlnGlyGlyAspAspValPheLysGliiAsnTyrVal 

ThrLeuSerGlyThrPheAspGluCysTV^^ProThrTyrLeuTyrGlnLysIleAsp 

AlaPheThrArgTyrGlnLeuArgGlyTyrlleGluAspSerGlnAspLeuGluIleTyrLeuIl 

AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrpProLeuSerAlaGlnSerProIle 

GlyLysCysGlyGluProAsnArgCysAlaProHisLeuGluTrpAsnProAspLeuAspCysSerCysArg 

AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIleAspValGlyCysThrAspLeuAsn 

GluAspLeuGlyValTrpValIlePheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu 

PheLeuGluGluLysProLeuValGlyGluAlaLeuAlaArgValLysArgAlaGluLysLysTrpArgAsp 

LysArgGluLysLeuGluTrpGluThrAsnIleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe 

ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnIleAlaMetIleHisAlaAlaAspLysArgVal 

HisSerIleArgGluAlaTyrLeuProGluLeuSerValIleProGlyValAsnAlaAlaIlePheGluGlu 

LeuGluGlyArgIlePheThrAlaPheSerLeuTyrAspAlaArgAsnValIleLysAsnGlyAspPheAsn 

AsnGlyLeuSerCysTrpAsnValLysGlyHisValAspValGluGluGlnAsnAsnGlnArgSerValLeu 

ValValProGluTrpGluAlaGluValSerGlnGluValArgValCysProGlyArgGlyTyrIleLeuArg 

ValThrAlaTyrLysGluGlyTyrGlyGluGlyCysValThrIleHisGluIleGluAsnAsnThrAspGlu 

I^xiLysPheSerAsnCysValGluGluGluIleTyrProAsnAsnThrValThrCysAsnAspTyrTh^ 

AsnGlnGluGluTyrGXyGlyAlaTyrThrSerArgAsnArgGlyTyrAsnGluMaProSerVa 

AspTyrAlaSerValTyrGluGluLysSerTyrThrAspGlyArgArgGluAsnProCysGluPheAsn^ 

GlyTyrArgAspTyrThrProLeuProValGlyTyrValThrLysGlul^uGluTyrPheProGluTh^ 

LysValTrpIleGluIleGlyGluThrGluGlyThrPheIleValAspSerValGluLeuLeuLeuMetGlu 

Glu 
6.7 Example 7 - DNA Sequences Encoding the Novel Crystal Proteins 
6.7.1 DNA Sequence Encoding the EG11063 Crystal Protein (SEQ ID NO:9) 



ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 
ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 
TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 
GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 
ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 
TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 
GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 
TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 624 
CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 672 
GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 
TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 768 
ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 
TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 
AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 
ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
ACA GTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 1872 
AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 1920 
ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 1968 
GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 2064 
TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 2112 
GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 2160 
ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 
AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 
AGC GTG GAA TTA CTC CTT ATG GAG GAA 3531 

6.7.2 DNA Sequence Encoding the EG11074 Crystal Protein (SEQ ID NO:11) 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
GAA GAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 
ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 
TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 
GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 
ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 
TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 
GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 
TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 624 
CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 672 
GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 
TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 768 
ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 
TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 
AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 
ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
GCA ACA CTC GAG GCT GAA TAT AAT CTG GAA AGA GCG CAG AAG GCG GTG 1872 
AAT GCG CTG TTT ACG TCT ACA AAC CAA CTA GGG CTA AAA ACA AAT GTA 1920 
ACG GAT TAT CAT ATT GAT CAA GTG TCC AAT TTA GTT ACG TAT TTA TCG 1968 
GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
CAT GCG AAG CGA CTC AGT GAT GAA CGC AAT TTA CTC CAA GAT TCA AAT 2064 
TTC AAA GAC ATT AAT AGG CAA CCA GAA CGT GGG TGG GGC GGA AGT ACA 2112 
GGG ATT ACC ATC CAA GGA GGG GAT GAC GTA TTT AAA GAA AAT TAC GTC 2160 
ACA CTA TCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
AGA GGG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 
AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 
AGC GTG GAA TTA CTC CTT ATG GAG GAA 3531 

6.7.3 DNA Sequence Encoding the EG11735 Crystal Protein (SEQ ID NO:13) 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 
ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 
TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 
GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 
ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 
TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 
GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 
TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT CAT GCT GTA 624 
CGC TGG TAC AAT ACG GGA TTA GAG CGT GTA TGG GGA CCG GAT TCT AGA 672 
GAT TGG ATA AGA TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 
TTA GAT ATC GTT TCT CTA TTT CCG AAC TAT GAT AGT AGA ACG TAT CCA 768 
ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 
TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 
GGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
ATC TAT ACG GAT GCT CAT AGA GGA GAA TAT TAT TGG TCA GGG CAT CAA 960 
ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 



AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 1872 
AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 1920 
ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 1968 
GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 2064 
TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 2112 
GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 2160 
ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 
AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 
AGC GTG GAA TTA CTC CTT ATG GAG GAA 3531 
6.7.4 DNA Sequence Encoding the EG11092 Crystal Protein (SEQ ID NO:25) 



ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 
ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 
TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 
GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 
ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 
TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 
GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 
TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT CAT GCT GTA 624 
CGC TGG TAC AAT ACG GGA TTA GAG CGT GTA TGG GGA CCG GAT TCT AGA 672 
GAT TGG ATA AGA TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 
TTA GAT ATC GTT TCT CTA TTT CCG AAC TAT GAT AGT AGA ACG TAT CCA 768 
ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 
TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 



AGA AGT ATT AGG ACT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 
ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 1872 
AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 1920 
ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 1968 
GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 2064 
TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 2112 
GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 2160 
ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 



TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 
AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 
AGC GTG GAA TTA CTC CTT ATG GAG GAA TAG 3534 



6.7.5 DNA Sequence Encoding the EG11751 Crystal Protein (SEQ ID NO:27) 



35 



ATG 


GAT 


AAC 


AAT 


CCG 


AAC 


ATC 


AAT 


GAA 


TGC 


ATT 


CCT 


TAT 


AAT 


TGT 


TTA 


48 


AGT 


AAC 


CCT 


GAA 


GTA 


GAA GTA TTA 


GGT 


GGA 


GAA 


AGA 


ATA 


GAA 


ACT 


GGT 


96 


TAC 


ACC 


CCA 


ATC 


GAT 


ATT 


TCC 


TTG 


TCG 


CTA 


ACG 


CAA 


TTT 


CTT 


TTG 


AGT 


144 


GAA 


TTT 


GTT 


CCC 


GGT 


GCT 


GGA TTT 


GTG 


TTA 


GGA 


CTA 


GTT 


GAT 


ATA 


ATA 


192 


TGG 


GGA 


ATT 


TTT 


GGT 


CCC 


TCT 


CAA 


TGG 


GAC 


GCA 


TTT 


CTT 


GTA 


CAA 


ATT 


240 


GAA 


CAO 


TTA 


ATT 


AAC 


CAA AGA ATA 


GAA 


GAA 


TTC 


GCT 


AGG 


AAC 


CAA 


GCC 


288 


ATT 


TCT 


AGA 


TTA 


GAA 


GGA 


CTA 


AGC 


AAT 


CTT 


TAT 


CAA 


ATT 


TAC 


GCA 


GAA 


336 


TCT 


TTT 


AGA 


GAG 


TGG 


GAA 


GCA 


GAT 


CCT 


ACT 


AAT 


CCA 


GCA 


TTA 


AGA 


GAA 


364 


GAG 


ATG 


CGT 


ATT 


CAA 


TTC 


AAT 


GAC 


ATG 


AAC 


AGT 


GCC 


CTT 


ACA 


ACC 


GCT 


432 


ATT 


CCT 


CTT 


TTT 


GCA 


GTT 


CAA 


AAT 


TAT 


CAA 


GTT 


CCT 


CTT 


TTA 


TCA 


GTA 


480 


TAT 


GTT 


CAA 


GCT 


GCA 


AAT 


TTA 


CAT 


TTA 


TCA 


GTT 


TTG 


AGA 


GAT 


GTT 


TCA 


528 


GTG 


TTT 


GGA 


CAA 


AGG 


TGG 


GGA 


TTT 


GAT 


GCC 


GCG 


ACT 


ATC 


AAT 


AGT 


CGT 


576 


TAT 


AAT 


GAT 


TTA 


ACT 


AGG 


CTT 


ATT 


GGC 


AAC 


TAT 


ACA 


GAT 


TAT 


GCT 


GTA 


624 



A l0577«(29MBOt' DOO 
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CGC TGG TAG AAT ACG GGA TTA GAA 
GAT TGG GTA AGG TAT AAT CAA TTT 
TTA GAT ATC GTT GCT CTG TTC CCG 
ATT CGA ACA GTT TCC CAA TTA ACA 
5 TTA GAA AAT TTT GAT GGT AGT TTT 

AGA AGT ATT AGG AGT CCA CAT TTG 
ATC TAT ACG GAT GCT CAT AGG GGT 
ATA ATG GCT TCT CCT GTA GGG TTT 
CTA TAT GGA ACT ATG GGA AAT GCA 
10 CAA CTA GGT CAG GGC GTG TAT AGA 
AGA CCT TTT AAT ATA GGG ATA AAT 
GGG ACA GAA TTT GCT TAT GGA ACC 
TAC AGA AAA AGC GGA ACG GTA GAT 
AAT AAC AAC GTG CCA CCT AGG CAA 
15 GTT TCA ATG TTT CGT TCA GGC TTT 

AGA GCT CCT ATG TTC TCT TGG ATA 
ATA ATT GCA TCG GAT AGT ATT ACT 
ACA CTT CAG TCA GGT ACT ACT GTT 
GGA GAT ATT CTT CGA CGA ACA AGT 
20 GTT AAT ATA AAT GGG CAA TTA CCC 

TAT GCC TCT ACT ACA AAT CTA AGA 
CGG ATT TTT GCT GGT CAA TTT AAC 
TTA ACA TTC CAA TCT TTT AGT TAC 
TTC CCA ATG AGC CAG AGT AGT TTC 
25 TCA GGG AAT GAA GTT TAT ATA GAC 
GCA ACA TTT GAA GCA GAA TAT GAT 
AAT GCG CTG TTT ACT TCT ATA AAC 
ACG GAT TAT CAT ATT GAT CAA GTA 
GAT GAA TTT TGT CTG GAT GAA AAG 
30 CAT GCG AAG CGA CTC AGT GAT GAG 

TTC AAA GGC ATC AAT AGG CAA CTA 
GAT ATT ACC ATC CAA AGA GGA GAT 
ACA CTA CCA GGT ACC TTT GAT GAG 
AAA ATC GAT GAA TCA AAA TTA AAA 
35 GGG TAT ATC GAA GAT AGT CAA GAC 

AAT GCA AAA CAT GAA ACA GTA AAT 



CGT GTA TGG GGA CCG GAT TCT AGA 672 
AGA AGA GAA TTA ACA CTA ACT GTA 720 
AAT TAT GAT AGT AGA AGA TAT CCA 768 
AGA GAA ATT TAT ACA AAC CCA GTA 816 
CGA GGC TCG GCT CAG GGC ATA GAA 864 
ATG GAT ATA CTT AAC AGT ATA ACC 912 
TAT TAT TAT TGG TCA GGG CAT CAA 960 

TCG GGG CCA GAA TTC ACT TTT CCG 1008 

GCT CCA CAA CAA CGT ATT GTT GCT 1056 

ACA TTA TCG TCC ACT TTA TAT AGA 1104 

AAT CAA CAA CTA TCT GTT CTT GAC 1152 

TCC TCA AAT TTG CCA TCC GCT GTA 1200 

TCG CTG GAT GAA ATA CCG CCA CAG 1248 

GGA TTT AGT CAT CGA TTA AGC CAT 1296 

AGT AAT AGT AGT GTA AGT ATA ATA 1344 

CAT CGT AGT GCT GAA TTT AAT AAT 1392 

CAA ATA CCA TTG GTA AAA GCA CAT 1440 

GTA AGA GGG CCC GGG TTT ACG GGA 1488 

GGA GGA CCA TTT GCT TAT ACT ATT 1536 

CAA AGG TAT CGT GCA AGA ATA CGC 1584 

ATT TAC GTA ACG* GTT GCA GGT GAA 1632 

AAA ACA ATG GAT ACC GGT GAC CCA 1680 

GCA ACT ATT AAT ACA GCT TTT ACA 1728 

ACA GTA GGT GCT GAT ACT TTT AGT 1776 

AGA TTT GAA TTG ATT CCA GTT ACT 1824 
TTA GAA AGA GCA CAA AAG GCG GTG 1872 
CAA ATA GGG ATA AAA ACA GAT GTG 1920 
TCC AAT TTA GTG GAT TGT TTA TCA 1968 
CGA GAA TTG TCC GAG AAA GTC AAA 2016 
CGG AAT TTA CTT CAA GAT CCA AAC 2064 
GAC CGT GGT TGG AGA GGA AGT ACG 2112 
GAC GTA TTC AAA GAA AAT TAT GTC 2160 
TGC TAT CCA ACA TAT TTG TAT CAA 2208 
GCC TTT ACC CGT TAT CAA TTA AGA 2256 
TTA GAA ATC TAT TTA ATT CGC TAC 2304 
GTG CCA GGT ACG GGT TCC TTA TGG 2352 
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CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TOT GGA GAG CCG AAT CGA 2400 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 

5 TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 

10 ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 

CCT GAG CTG TCT GTG. ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 

GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 

15 GTT GTT CCG GAA TGG GAA GCA GAA GTG 'TCA CAA GAA GTT CGT GTC TGT 3072 

CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 

AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 

AGC GTG GAA TTA CTC CTT ATG GAG GAA TAG 3534 

6.7.6 DNA Sequence Encoding the EG11091 Crystal Protein (SEQ ID NO:29) 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 624 

CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 672 

GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 

TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 768 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 

AGA GCT CCT ATG TTC TCT TGG ATA CAT CGT AGT GCA ACT CTT ACA AAT 1392 

ACA ATT GAT CCA GAG AGA ATT AAT CAA ATA CCT TTA GTG AAA GGA TTT 1440 

AGA GTT TGG GGG GGC ACC TCT GTC ATT ACA GGA CCA GGA TTT ACA GGA 1488 

GGG GAT ATC CTT CGA AGA AAT ACC TTT GGT GAT TTT GTA TCT CTA CAA 1536 

GTC AAT ATT AAT TCA CCA ATT ACC CAA AGA TAC CGT TTA AGA TTT CGT 1584 

TAC GCT TCC AGT AGG GAT GCA CGA GTT ATA GTA TTA ACA GGA GCG GCA 1632 

TCC ACA GGA GTG GGA GGC CAA GTT AGT GTA AAT ATG CCT CTT CAG AAA 1680 

ACT ATG GAA ATA GGG GAG AAC TTA ACA TCT AGA ACA TTT AGA TAT ACC 1728 

GAT TTT AGT AAT CCT TTT TCA TTT AGA GCT AAT CCA GAT ATA ATT GGG 1776 

ATA AGT GAA CAA CCT CTA TTT GGT GCA GGT TCT ATT AGT AGC GGT GAA 1824 

CTT TAT ATA GAT AAA ATT GAA ATT ATT CTA GCA GAT GCA ACA TTT GAA 1872 

GCA GAA TCT GAT TTA GAA AGA GCA CAA AAG GCG GTG AAT GCC CTG TTT 1920 

ACT TCT TCC AAT CAA ATC GGG TTA AAA ACC GAT GTG ACG GAT TAT CAT 1968 

ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA GAT GAA TTT TGT 2016 

CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA CAT GCG AAG CGA 2064 

CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC TTC AGA GGG ATC 2112 

AAT AGA CAA CCA GAC CGT GGC TGG AGA GGA AGT ACA GAT ATT ACC ATC 2160 

CAA GGA GGA GAT GAC GTA TTC AAA GAG AAT TAC GTC ACA CTA CCG GGT 2208 

ACC GTT GAT GAG TGC TAT CCA ACG TAT TTA TAT CAG AAA ATA GAT GAG 2256 

TCG AAA TTA AAA GCT TAT ACC CGT TAT GAA TTA AGA GGG TAT ATC GAA 2304 

GAT AGT CAA GAC TTA GAA ATC TAT TTG ATC CGT TAC AAT GCA AAA CAC 2352 

GAA ATA GTA AAT GTG CCA GGC ACG GGT TCC TTA TGG CCG CTT TCA GCC 2400 

CAA AGT CCA ATC GGA AAG TGT GGA GAA CCG AAT CGA TGC GCG CCA CAC 2448 

CTT GAA TGG AAT CCT GAT CTA GAT TGT TCC TGC AGA GAC GGG GAA AAA 2496 

TGT GCA CAT CAT TCC CAT CAT TTC ACC TTG GAT ATT GAT GTT GGA TGT 2544 

ACA GAC TTA AAT GAG GAC TTA GGT GTA TGG GTG ATA TTC AAG ATT AAG 2592 

ACG CAA GAT GGC CAT GCA AGA CTA GGG AAT CTA GAG TTT CTC GAA GAG 2640 

AAA CCA TTA TTA GGG GAA GCA CTA GCT CGT GTG AAA AGA GCG GAG AAG 2688 

AAG TGG AGA GAC AAA CGA GAG AAA CTG CAG TTG GAA ACA AAT ATT GTT 2736 

TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT GTA AAC TCT CAA 2784 

TAT GAT AGA TTA CAA GTG GAT ACG AAC ATC GCA ATG ATT CAT GCG GCA 2832 

GAT AAA CGC GTT CAT AGA ATC CGG GAA GCG TAT CTG CCA GAG TTG TCT 2880 

GTG ATT CCA GGT GTC AAT GCG GCC ATT TTC GAA GAA TTA GAG GGA CGT 2928 

ATT TTT ACA GCG TAT TCC TTA TAT GAT GCG AGA AAT GTC ATT AAA AAT 2976 

GGC GAT TTC AAT AAT GGC TTA TTA TGC TGG AAC GTG AAA GGT CAT GTA 3024 

GAT GTA GAA GAG CAA AAC AAC CAC CGT TCG GTC CTT GTT ATC CCA GAA 3072 

TGG GAG GCA GAA GTG TCA CAA GAG GTT CGT GTC TGT CCA GGT CGT GGC 3120 

TAT ATC CTT CGT GTC ACA GCA TAT AAA GAG GGA TAT GGA GAG GGC TGC 3168 

GTA ACG ATC CAT GAG ATC GAA GAC AAT ACA GAC GAA CTG AAA TTC AGC 3216 

AAC TGT GTA GAA GAG GAA GTA TAT CCA AAC AAC ACA GTA ACG TGT AAT 3264 

AAT TAT ACT GGG ACT CAA GAA GAA TAT GAG GGT ACG TAC ACT TCT CGT 3312 

AAT CAA GGA TAT GAC GAA GCC TAT GGT AAT AAC CCT TCC GTA CCA GGT 3360 

GAT TAC GCT TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3408 

GAG AAT CCT TGT GAA TCT AAC AGA GGC TAT GGG GAT TAC ACA CCA CTA 3456 

CCG GOT GGT TAT GTA ACA AAG GAT TTA GAG TAC TTC CCA GAG ACC GAT 3504 

AAG GTA TGG ATT GAG ATC GGA GAA ACA GAA GGA ACA TTC ATC GTG GAT 3552 

AGC GTG GAA TTA CTC CTT ATG GAG GAA 3579 



6.7.7 DNA Sequence Encoding the EG11768 Crystal Protein (SEQ ID NO:33) 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GGC GCG ACT ATC AAT AGT CGT 576 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 624 

CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 672 

GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 

TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 768 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 

AGA GCT CCT ATG TTC TCT TGG ATA CAT CGT AGT GCT GAA TTT AAT AAT 1392 

ATA AST GCA TOG GAT AGT ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 

GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 

CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 

TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 

TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 

TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 

GCA ACA CTC GAG GCT GAA TAT AAT CTG GAA AGA GCG CAG AAG GCG GTG 1872 

AAT GCG CTG TTT ACG TCT ACA AAC CAA CTA GGG CTA AAA ACA AAT GTA 1920 

ACG GAT TAT CAT ATT GAT CAA GTG TCC AAT TTA GTT ACG TAT TTA TCG 1968 

GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 

CAT GCG AAG CGA CTC AGT GAT GAA CGC AAT TTA CTC CAA GAT TCA AAT 2064 

TTC AAA GAC ATT AAT AGG CAA CCA GAA CGT GGG TGG GGC GGA AGT ACA 2112 

GGG ATT ACC ATC CAA GGA GGG GAT GAC GTA TTT AAA GAA AAT TAC GTC 2160 

ACA CTA TCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 

GCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 

ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 

CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 

GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 

GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 

CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 

AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 



AGC GTG GAA TTA CTC CTT ATG GAG GAA TAG 3534 



8. Sequence Listing 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Malvar, Thomas 

Gilmer, Amy Jelen 

(ii) TITLE OF INVENTION: BROAD-SPECTRUM DELTA-ENDOTOXINS 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Arnold, White & Durkee 

(B) STREET: P.O. BOX 4433 

(C) CITY: Houston 

(D) STATE: Texas 

(E) COUNTRY: USA 

(F) ZIP: 77210 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US Unknown 

(B) FILING DATE: Concurrently Herewith 

(C) CLASSIFICATION: Unknown 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/754,490 

(B) FILING DATE: 20-NOV-1996 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Kitchell, Barbara S. 

(B) REGISTRATION NUMBER: 33,928 

(C) REFERENCE/DOCKET NUMBER: MOBT:163 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 512/418-3000 

(B) TELEFAX: 512/474-7577 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
GGATAGCACT CATCAAAGGT ACC 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GAAGATATCC AATTCGAACA GTTTCCC 
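
As a consistency check on the short oligonucleotide entries, the declared LENGTH can be compared with the number of bases on the SEQUENCE DESCRIPTION line. The Python sketch below is illustrative only and is not part of the listing; the function name `base_count` is a hypothetical helper, and the check assumes the bases are printed in space-separated groups as they are here.

```python
def base_count(listing_line: str) -> int:
    """Count the bases on a listing line, ignoring the spaces between groups."""
    bases = listing_line.replace(" ", "").upper()
    unexpected = set(bases) - set("ACGT")
    if unexpected:
        # Any other character suggests a transcription error in the line.
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return len(bases)


# SEQ ID NO: 2 as printed in the listing, with its declared LENGTH of 27.
seq_id_2 = "GAAGATATCC AATTCGAACA GTTTCCC"
declared_length = 27
print(base_count(seq_id_2) == declared_length)
```

A line that raises `ValueError` or returns a count different from the declared LENGTH is a candidate for re-checking against the filed copy.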



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 
CATATTCTGC CTCGAGTGTT GCAGTAAC 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 
CCCGATCGGC CGCATGC 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CATTGGAGCT CTCCATG 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GCACTACGAT GTATCC 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 
CATCGTAGTG CAACTCTTAC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8 
CCAAfiAAAAT ACTAGAGCTC TTGTTAAAAA AGGTGTTCC 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3531 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3531 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly 
20 25 30 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser 
35 40 45 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile 
50 55 60 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile 
65 70 75 80 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 288 
Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 336 
Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 384 
Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 432 
Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 480 
Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 528 
Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 576 
Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 624 
Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val 
195 200 205 

CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 672 
Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 720 
Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 768 
Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro 
245 250 255 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 816 
Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val 
260 265 270 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 864 
Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 960 
Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala 
340 345 350 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 

AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 1872 
Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 1920 
Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val 
625 630 635 640 

ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 1968 
Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 2064 
His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn 
675 680 685 

TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 2112 
Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 

GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 2160 
Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 

CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 3456 
Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 3504 
Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

AGC GTG GAA TTA CTC CTT ATG GAG GAA 3531 
Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 
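
The pairing of each DNA row with the amino acid row printed beneath it can be checked mechanically. The Python sketch below is illustrative only and is not part of the listing; it assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers. It translates the first row of SEQ ID NO: 9 and confirms that the 3531-base CDS corresponds to the 1177 residues stated for SEQ ID NO: 10.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(row: str) -> str:
    """Translate a row of space-separated codons into one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in row.split())


# First row of SEQ ID NO: 9 (bases 1 to 48).
first_row = "ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA"
print(translate(first_row))   # one-letter form of Met Asp Asn Asn Pro Asn Ile ...

# CDS location 1..3531 should divide evenly into codons.
cds_length = 3531
print(cds_length % 3 == 0, cds_length // 3)
```

A codon that is absent from the table, such as one containing a character other than A, C, G or T, raises `KeyError`, which flags the row for re-checking.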



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asp Asn Asn Pro Asn He Asn Glu Cys He Pro Tyr Asn Cys Leu 
15 10 15 
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Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg He Glu Thr Gly 
20 25 30 

Tyr Thr Pro lie Asp lie Ser Leu Ser Leu Thr Gin Phe Leu Leu Ser 
35 40 45 

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp He He 
50 55 60 

Trp Gly He Phe Gly Pro Ser Gin Trp Asp Ala Phe Leu Val Gin He 
65 70 75 80 

Glu Gin Leu He Asn Gin Arg He Glu Glu Phe Ala Arg Asn Gin Ala 

. 85 , ^? . 

He Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gin He Tyr Ala Glu 
100 105 110. 

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

Glu Met Arg He Gin Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

He Pro Leu Phe Ala Val Gin Asn Tyr Gin Val Pro Leu Leu Ser Val 
145 ISO 155 160 

Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val 
195 200 205 

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro 
245 250 255 

Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val 
260 265 270 

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala 
340 345 350 

Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 



Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 

Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val 
625 630 635 640 

Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn 
675 680 685 

Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 

Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 

Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3531 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3531 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly 
20 25 30 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 
Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser 
35 40 45 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 
Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile 
50 55 60 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 
Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile 
65 70 75 80 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 
Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 
Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 
Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 
Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 
Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 
Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 
Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 
Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val 
195 200 205 

CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 
Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 
Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 
Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro 
245 250 255 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 
Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val 
260 265 270 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 
Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 
Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 
Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 
Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 
Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala 
340 345 350 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 
Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 
Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 
Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 
Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 
Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 
Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 

AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 
Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 
Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 
Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 
Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 
Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 
Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 
Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 






TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

GCA ACA CTC GAG GCT GAA TAT AAT CTG GAA AGA GCG CAG AAG GCG GTG 1872 
Ala Thr Leu Glu Ala Glu Tyr Asn Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

AAT GCG CTG TTT ACG TCT ACA AAC CAA CTA GGG CTA AAA ACA AAT GTA 1920 
Asn Ala Leu Phe Thr Ser Thr Asn Gln Leu Gly Leu Lys Thr Asn Val 
625 630 635 640 

ACG GAT TAT CAT ATT GAT CAA GTG TCC AAT TTA GTT ACG TAT TTA TCG 1968 
Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Thr Tyr Leu Ser 
645 650 655 



GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

CAT GCG AAG CGA CTC AGT GAT GAA CGC AAT TTA CTC CAA GAT TCA AAT 2064 
His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Ser Asn 
675 680 685 

TTC AAA GAC ATT AAT AGG CAA CCA GAA CGT GGG TGG GGC GGA AGT ACA 2112 
Phe Lys Asp Ile Asn Arg Gln Pro Glu Arg Gly Trp Gly Gly Ser Thr 
690 695 700 

GGG ATT ACC ATC CAA GGA GGG GAT GAC GTA TTT AAA GAA AAT TAC GTC 2160 
Gly Ile Thr Ile Gln Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

ACA CTA TCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208 
Thr Leu Ser Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256 
Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 






GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 
Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 
Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 
Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 

CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 
Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 
Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 
Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 
Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 
Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 
Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 
Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 
Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 
Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

AGC GTG GAA TTA CTC CTT ATG GAG GAA 
Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly 
20 25 30 

Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser 
35 40 45 

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile 
50 55 60 

Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile 
65 70 75 80 

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val 
195 200 205 

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro 
245 250 255 

Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val 
260 265 270 

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala 
340 345 350 

Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 

Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

Ala Thr Leu Glu Ala Glu Tyr Asn Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

Asn Ala Leu Phe Thr Ser Thr Asn Gln Leu Gly Leu Lys Thr Asn Val 
625 630 635 640 

Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Thr Tyr Leu Ser 
645 650 655 

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Ser Asn 
675 680 685 

Phe Lys Asp Ile Asn Arg Gln Pro Glu Arg Gly Trp Gly Gly Ser Thr 
690 695 700 

Gly Ile Thr Ile Gln Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

Thr Leu Ser Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 



Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 

Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3531 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3531 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 48 
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 96 
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly 
20 25 30 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 144 
Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser 
35 40 45 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 192 
Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile 
50 55 60 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 240 
Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile 
65 70 75 80 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 
Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 
Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 
Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 
Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 
Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 
Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 
Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT CAT GCT GTA 
Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp His Ala Val 
195 200 205 

CGC TGG TAC AAT ACG GGA TTA GAG CGT GTA TGG GGA CCG GAT TCT AGA 
Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

GAT TGG ATA AGA TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 
Asp Trp Ile Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

TTA GAT ATC GTT TCT CTA TTT CCG AAC TAT GAT AGT AGA ACG TAT CCA 
Leu Asp Ile Val Ser Leu Phe Pro Asn Tyr Asp Ser Arg Thr Tyr Pro 
245 250 255 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 
Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val 
260 265 270 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 
Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

GGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 912 
Gly Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

ATC TAT ACG GAT GCT CAT AGA GGA GAA TAT TAT TGG TCA GGG CAT CAA 960 
Ile Tyr Thr Asp Ala His Arg Gly Glu Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 1008 
Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 1056 
Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala 
340 345 350 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 

AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 1728 
Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 1776 
Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 1824 
Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 1872 
Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 1920 
Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val 
625 630 635 640 

ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 1968 
Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 2016 
Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 2064 
His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn 
675 680 685 

TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 2112 
Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 






GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 
Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 
Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 
Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 
Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 
Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 
Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 
Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 
Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 
Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 
Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 
Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 
Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 
Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 
Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 



ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 
He His Ala Ala Asp Lys Arg Val His Ser lie Arg Glu Ala Tyr Leu 
930 935 940 



2832 



10 



CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
Pro Glu Leu Ser Val He Pro Gly Val Asn Ala Ala lie Phe Glu Glu 
945 950 955 960 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
Leu Glu Gly Arg He Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 



GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 2976 
Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 3024 
Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 3072 
Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 



CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 3120 
Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 3168 
Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 3216 
Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 3264 
Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 3312 
Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 3360 
Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 3408 
Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 
Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 



AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 
Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

AGC GTG GAA TTA CTC CTT ATG GAG GAA 
Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg He Glu Thr Gly 
20 25 30 

Tyr Thr Pro He Asp He Ser Leu Ser Leu Thr Gin Phe Leu Leu Ser 
35 40 45 

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp He He 
50 55 60 

Trp Gly He Phe Gly Pro Ser Gin Trp Asp Ala Phe Leu Val Gin He 
65 70 75 80 

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

He Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gin He Tyr Ala Glu 
100 105 110 

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

Glu Met Arg He Gin Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

He Pro Leu Phe Ala Val Gin Asn Tyr Gin Val Pro Leu Leu Ser Val 
145 150 155 160 






Tyr Val Gin Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

Val Phe Gly Gin Arg Trp Gly Phe Asp Ala Ala Thr He Asn Ser Arg 
180 185 190 

Tyr Asn Asp Leu Thr Arg Leu He Gly Asn Tyr Thr Asp His Ala Val 
195 200 205 

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

Asp Trp He Arg Tyr Asn Gin Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

Leu Asp Ile Val Ser Leu Phe Pro Asn Tyr Asp Ser Arg Thr Tyr Pro 
245 250 255 

He Arg Thr Val Ser Gin Leu Thr Arg Glu He Tyr Thr Asn Pro Val 
260 265 270 

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu 
275 280 285 

Gly Ser He Arg Ser Pro His Leu Met Asp He Leu Asn Ser He Thr 
290 295 300 

He Tyr Thr Asp Ala His Arg Gly Glu Tyr Tyr Trp Ser Gly His Gin 
305 310 315 320 

He Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gin Gin Arg He Val Ala 
340 345 350 

Gin Leu Gly Gin Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

Arg Pro Phe Asn He Gly He Asn Asn Gin Gin Leu Ser Val Leu Asp 
370 375 380 

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu He Pro Pro Gin 
405 410 415 

Asn Asn Asn Val Pro Pro Arg Gin Gly Phe Ser His Arg Leu Ser His 
420 425 430 

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser He He 
435 440 445 

Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 



Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

Gly Asp He Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr He 
500 505 510 

Val Asn He Asn Gly Gin Leu Pro Gin Arg Tyr Arg Ala Arg He Arg 
515 520 525 

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

Leu Thr Phe Gin Ser Phe Ser Tyr Ala Thr He Asn Thr Ala Phe Thr 
565 570 575 

Phe Pro Met Ser Gin Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

Ser Gly Asn Glu Val Tyr He Asp Arg Phe Glu Leu He Pro Val Thr 
595 600 605 

Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gin Lys Ala Val 
610 615 620 

Asn Ala Leu Phe Thr Ser He Asn Gin He Gly He Lys Thr Asp Val 
625 630 635 640 

Thr Asp Tyr His He Asp Gin Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn 
675 680 685 

Phe Lys Gly He Asn Arg Gin Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 

Asp He Thr He Gin Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gin 
725 730 735 

Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 



Gly Tyr He Glu Asp Ser Gin Asp Leu Glu He Tyr Leu He Arg Tyr 
755 760 765 

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

Pro Leu Ser Ala Gin Ser Pro He Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val He 
835 840 845 

Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

Val Asn Ser Gin Tyr Asp Gin Leu Gin Ala Asp Thr Asn He Ala Met 
915 920 925 

He His Ala Ala Asp Lys Arg Val His Ser He Arg Glu Ala Tyr Leu 
930 935 940 

Pro Glu Leu Ser Val He Pro Gly Val Asn Ala Ala He Phe Glu Glu 
945 950 955 960 

Leu Glu Gly Arg He Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

Val He Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

Lys Gly His Val Asp Val Glu Glu Gin Asn Asn Gin Arg Ser Val Leu 
995 1000 1005 

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gin Glu Val Arg Val Cys 
1010 1015 1020 

Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

Gly Glu Gly Cys Val Thr He His Glu He Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu He Tyr Pro Asn Asn Thr 
1060 1065 1070 

Val Thr Cys Asn Asp Tyr Thr Val Asn Gin Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

Lys Val Trp He Glu He Gly Glu Thr Glu Gly Thr Phe He Val Asp 
1155 1160 1165 

Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TATCCAATTC GAACGTCATC 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 
TTTAGTCATC GATTAAATCA 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
ATAATAAGAG CTCCAATGTT 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 
TACATCGTAG TGCAACTCTT 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19 
TCATGGAlGAG CTCCTATGTT 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
TTAACAAGAG CTCCTATGTT 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
ACTACCAGGT ACCTTTGATG 



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22 
ACTACCGGGT ACCTTTGATA 



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
ATTTGAGTAA TACTATCC 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
ATTACTCAAA TACCATTGG 



(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3534 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..3531 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg He Glu Thr Gly 
20 25 30 



TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 
Tyr Thr Pro He Asp He Ser Leu Ser Leu Thr Gin Phe Leu Leu Ser 
35 40 45 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 
Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp He He 
50 55 60 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 
Trp Gly He Phe Gly Pro Ser Gin Trp Asp Ala Phe Leu Val Gin He 
65 70 75 80 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 
Glu Gin Leu He Asn Gin Arg He Glu Glu Phe Ala Arg Asn Gin Ala 
85 90 95 



ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 
He Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gin He Tyr Ala Glu 
100 105 110 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 
Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 
Glu Met Arg He Gin Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 
Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 
Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 
Val Phe Gly Gin Arg Trp Gly Phe Asp Ala Ala Thr He Asn Ser Arg 
180 185 190 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT CAT GCT GTA 
Tyr Asn Asp Leu Thr Arg Leu He Gly Asn Tyr Thr Asp His Ala Val 
195 200 205 

CGC TGG TAC AAT ACG GGA TTA GAG CGT GTA TGG GGA CCG GAT TCT AGA 
Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

GAT TGG ATA AGA TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 
Asp Trp Ile Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

TTA GAT ATC GTT TCT CTA TTT CCG AAC TAT GAT AGT AGA ACG TAT CCA 
Leu Asp He Val Ser Leu Phe Pro Asn Tyr Asp Ser Arg Thr Tyr Pro 
245 250 255 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 
He Arg Thr Val Ser Gin Leu Thr Arg Glu He Tyr Thr Asn Pro Val 
260 265 270 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 
Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gin Gly He Glu 
275 280 285 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 
Arg Ser He Arg Ser Pro His Leu Met Asp He Leu Asn Ser He Thr 
290 295 300 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 
He Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gin 
305 310 315 320 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 
He Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 
Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gin Gin Arg He Val Ala 
340 345 350 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 1104 
Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 1152 
Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp 
370 375 380 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 1200 
Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 1248 
Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln 
405 410 415 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 1296 
Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His 
420 425 430 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 1344 
Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile 
435 440 445 



AGA GCT CCA ATG TTT TCT TGG ACG CAC CGT AGT GCA ACC CCT ACA AAT 1392 
Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

ACA ATT GAT CCG GAG AGG ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 1440 
Thr Ile Asp Pro Glu Arg Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 1488 
Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT 1536 
Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 1584 
Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 1632 
Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 1680 
Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 
Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 
Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 
Ser Gly Asn Glu Val Tyr He Asp Arg Phe Glu Leu He Pro Val Thr 
595 600 605 

GCA ACA TTT GAA GCA GAA TAT GAT TTA GAA AGA GCA CAA AAG GCG GTG 
Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 
Asn Ala Leu Phe Thr Ser He Asn Gin He Gly He Lys Thr Asp Val 
625 630 635 640 

ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 
Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 
Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC 
His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gin Asp Pro Asn 
675 680 685 

TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 
Phe Lys Gly He Asn Arg Gin Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 

GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 
Asp He Thr He Gin Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 
Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gin 
725 730 735 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 
Lys He Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gin Leu Arg 
740 745 750 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 
Gly Tyr He Glu Asp Ser Gin Asp Leu Glu He Tyr Leu He Arg Tyr 
755 760 765 

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448 
Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688 
Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG 2832 
Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 2880 
Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 2928 
Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 
Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 
Lys Gly His Val Asp Val Glu Glu Gin Asn Asn Gin Arg Ser Val Leu 
995 1000 1005 



GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 
Val Val Pro Glu Trp Glu Ala Glu Val Ser Gin Glu Val Arg Val Cys 
1010 1015 1020 



CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 
Pro Gly Arg Gly Tyr He Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 
Gly Glu Gly Cys Val Thr He His Glu He Glu Asn Asn Thr Asp Glu 
1045 1050 1055 



CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 
Leu Lys Phe Ser Asn Cys Val Glu Glu Glu He Tyr Pro Asn Asn Thr 
1060 1065 1070 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 
Val Thr Cys Asn Asp Tyr Thr Val Asn Gin Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 



TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 
Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 
Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 
Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 



CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT 
Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 
Lys Val Trp He Glu He Gly Glu Thr Glu Gly Thr Phe He Val Asp 
1155 1160 1165 



AGC GTG GAA TTA CTC CTT ATG GAG GAA TAG 
Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly 
20 25 30 

Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser 
35 40 45 

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile 
50 55 60 



Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile 
65 70 75 80 

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg 
180 185 190 

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp His Ala Val 
195 200 205 

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

Asp Trp Ile Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

Leu Asp He Val Ser Leu Phe Pro Asn Tyr Asp Ser Arg Thr Tyr Pro 
245 250 255 

He Arg Thr Val Ser Gin Leu Thr Arg Glu He Tyr Thr Asn Pro Val 
260 265 270 

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gin Gly He Glu 
275 280 285 

Arg Ser He Arg Ser Pro His Leu Met Asp He Leu Asn Ser He Thr 
290 295 300 

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln 
305 310 315 320 

He Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gin Gin Arg He Val Ala 
340 345 350 

Gin Leu Gly Gin Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg 
355 360 365 

Arg Pro Phe Asn He Gly He Asn Asn Gin Gin Leu Ser Val Leu Asp 
370 375 380 

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu He Pro Pro Gin 
405 410 415 



Asn Asn Asn Val Pro Pro Arg Gin Gly Phe Ser His Arg Leu Ser His 
420 425 430 

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser He He 
435 440 445 

Arg Ala Pro Met Phe Ser Trp Thr His Arg Ser Ala Thr Pro Thr Asn 
450 455 460 

Thr He Asp Pro Glu Arg He Thr Gin He Pro Leu Val Lys Ala His 
465 470 475 480 

Thr Leu Gin Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

Gly Asp He Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr He 
500 505 510 

Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 



Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val 
625 630 635 640 

Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser 
645 650 655 

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn 
675 680 685 

Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr 
690 695 700 

Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 



Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

Phe Lys He Lys Thr Gin Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

Thr Asn He Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

He His Ala Ala Asp Lys Arg Val His Ser He Arg Glu Ala Tyr Leu 
930 935 940 

Pro Glu Leu Ser Val He Pro Gly Val Asn Ala Ala He Phe Glu Glu 
945 950 955 960 

Leu Glu Gly Arg He Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

Lys Gly His Val Asp Val Glu Glu Gin Asn Asn Gin Arg Ser Val Leu 
995 1000 1005 

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gin Glu Val Arg Val Cys 
1010 1015 1020 

Pro Gly Arg Gly Tyr He Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

Gly Glu Gly Cys Val Thr He His Glu He Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu He Tyr Pro Asn Asn Thr 
1060 1065 1070 

Val Thr Cys Asn Asp Tyr Thr Val Asn Gin Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3534 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3531 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

ATG GAT AAC AAT CCG AAC ATC AAT GAA TGC ATT CCT TAT AAT TGT TTA 
Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu 
1 5 10 15 

AGT AAC CCT GAA GTA GAA GTA TTA GGT GGA GAA AGA ATA GAA ACT GGT 
Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg He Glu Thr Gly 
20 25 30 

TAC ACC CCA ATC GAT ATT TCC TTG TCG CTA ACG CAA TTT CTT TTG AGT 
Tyr Thr Pro He Asp He Ser Leu Ser Leu Thr Gin Phe Leu Leu Ser 
35 40 45 

GAA TTT GTT CCC GGT GCT GGA TTT GTG TTA GGA CTA GTT GAT ATA ATA 
Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp He He 
50 55 60 

TGG GGA ATT TTT GGT CCC TCT CAA TGG GAC GCA TTT CTT GTA CAA ATT 
Trp Gly He Phe Gly Pro Ser Gin Trp Asp Ala Phe Leu Val Gin He 
65 70 75 80 

GAA CAG TTA ATT AAC CAA AGA ATA GAA GAA TTC GCT AGG AAC CAA GCC 
Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala 
85 90 95 

ATT TCT AGA TTA GAA GGA CTA AGC AAT CTT TAT CAA ATT TAC GCA GAA 
Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu 
100 105 110 

TCT TTT AGA GAG TGG GAA GCA GAT CCT ACT AAT CCA GCA TTA AGA GAA 
Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu 
115 120 125 

GAG ATG CGT ATT CAA TTC AAT GAC ATG AAC AGT GCC CTT ACA ACC GCT 
Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala 
130 135 140 

ATT CCT CTT TTT GCA GTT CAA AAT TAT CAA GTT CCT CTT TTA TCA GTA 
Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val 
145 150 155 160 

TAT GTT CAA GCT GCA AAT TTA CAT TTA TCA GTT TTG AGA GAT GTT TCA 
Tyr Val Gin Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser 
165 170 175 

GTG TTT GGA CAA AGG TGG GGA TTT GAT GCC GCG ACT ATC AAT AGT CGT 
Val Phe Gly Gin Arg Trp Gly Phe Asp Ala Ala Thr He Asn Ser Arg 
180 185 190 

TAT AAT GAT TTA ACT AGG CTT ATT GGC AAC TAT ACA GAT TAT GCT GTA 
Tyr Asn Asp Leu Thr Arg Leu He Gly Asn Tyr Thr Asp Tyr Ala Val 
195 200 205 

CGC TGG TAC AAT ACG GGA TTA GAA CGT GTA TGG GGA CCG GAT TCT AGA 
Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg 
210 215 220 

GAT TGG GTA AGG TAT AAT CAA TTT AGA AGA GAA TTA ACA CTA ACT GTA 
Asp Trp Val Arg Tyr Asn Gin Phe Arg Arg Glu Leu Thr Leu Thr Val 
225 230 235 240 

TTA GAT ATC GTT GCT CTG TTC CCG AAT TAT GAT AGT AGA AGA TAT CCA 
Leu Asp He Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro 
245 250 255 

ATT CGA ACA GTT TCC CAA TTA ACA AGA GAA ATT TAT ACA AAC CCA GTA 
He Arg Thr Val Ser Gin Leu Thr Arg Glu He Tyr Thr Asn Pro Val 
260 265 270 

TTA GAA AAT TTT GAT GGT AGT TTT CGA GGC TCG GCT CAG GGC ATA GAA 
Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gin Gly He Glu 
275 280 285 

AGA AGT ATT AGG AGT CCA CAT TTG ATG GAT ATA CTT AAC AGT ATA ACC 
Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr 
290 295 300 

ATC TAT ACG GAT GCT CAT AGG GGT TAT TAT TAT TGG TCA GGG CAT CAA 
He Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gin 
305 310 315 320 

ATA ATG GCT TCT CCT GTA GGG TTT TCG GGG CCA GAA TTC ACT TTT CCG 
He Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro 
325 330 335 

CTA TAT GGA ACT ATG GGA AAT GCA GCT CCA CAA CAA CGT ATT GTT GCT 
Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala
340 345 350 

CAA CTA GGT CAG GGC GTG TAT AGA ACA TTA TCG TCC ACT TTA TAT AGA 
Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg
355 360 365 

AGA CCT TTT AAT ATA GGG ATA AAT AAT CAA CAA CTA TCT GTT CTT GAC 
Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp
370 375 380 

GGG ACA GAA TTT GCT TAT GGA ACC TCC TCA AAT TTG CCA TCC GCT GTA 
Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val 
385 390 395 400 

TAC AGA AAA AGC GGA ACG GTA GAT TCG CTG GAT GAA ATA CCG CCA CAG 
Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln
405 410 415 

AAT AAC AAC GTG CCA CCT AGG CAA GGA TTT AGT CAT CGA TTA AGC CAT 
Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His
420 425 430 

GTT TCA ATG TTT CGT TCA GGC TTT AGT AAT AGT AGT GTA AGT ATA ATA 
Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile
435 440 445 

AGA GCT CCT ATG TTC TCT TGG ATA CAT CGT AGT GCT GAA TTT AAT AAT 
Arg Ala Pro Met Phe Ser Trp Ile His Arg Ser Ala Glu Phe Asn Asn
450 455 460 

ATA ATT GCA TCG GAT AGT ATT ACT CAA ATA CCA TTG GTA AAA GCA CAT 
Ile Ile Ala Ser Asp Ser Ile Thr Gln Ile Pro Leu Val Lys Ala His
465 470 475 480 

ACA CTT CAG TCA GGT ACT ACT GTT GTA AGA GGG CCC GGG TTT ACG GGA 
Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly
485 490 495

GGA GAT ATT CTT CGA CGA ACA AGT GGA GGA CCA TTT GCT TAT ACT ATT
Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile
500 505 510



GTT AAT ATA AAT GGG CAA TTA CCC CAA AGG TAT CGT GCA AGA ATA CGC 
Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg
515 520 525

TAT GCC TCT ACT ACA AAT CTA AGA ATT TAC GTA ACG GTT GCA GGT GAA 
Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu
530 535 540 



CGG ATT TTT GCT GGT CAA TTT AAC AAA ACA ATG GAT ACC GGT GAC CCA 
Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro
545 550 555 560



TTA ACA TTC CAA TCT TTT AGT TAC GCA ACT ATT AAT ACA GCT TTT ACA 
Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr
565 570 575 



TTC CCA ATG AGC CAG AGT AGT TTC ACA GTA GGT GCT GAT ACT TTT AGT 
Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser
580 585 590 



TCA GGG AAT GAA GTT TAT ATA GAC AGA TTT GAA TTG ATT CCA GTT ACT 
Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr
595 600 605 

GCA ACA TTT GAA GCA GAA TAT GAT CTA GAA AGA GCA CAA AAG GCG GTG 
Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val
610 615 620 

AAT GCG CTG TTT ACT TCT ATA AAC CAA ATA GGG ATA AAA ACA GAT GTG 
Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val
625 630 635 640 



ACG GAT TAT CAT ATT GAT CAA GTA TCC AAT TTA GTG GAT TGT TTA TCA 
Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser
645 650 655 



GAT GAA TTT TGT CTG GAT GAA AAG CGA GAA TTG TCC GAG AAA GTC AAA 
Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 



CAT GCG AAG CGA CTC AGT GAT GAG CGG AAT TTA CTT CAA GAT CCA AAC
His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn
675 680 685 

TTC AAA GGC ATC AAT AGG CAA CTA GAC CGT GGT TGG AGA GGA AGT ACG 
Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr
690 695 700

GAT ATT ACC ATC CAA AGA GGA GAT GAC GTA TTC AAA GAA AAT TAT GTC 2160
Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val
705 710 715 720 

ACA CTA CCA GGT ACC TTT GAT GAG TGC TAT CCA ACA TAT TTG TAT CAA 2208
Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln
725 730 735 

AAA ATC GAT GAA TCA AAA TTA AAA GCC TTT ACC CGT TAT CAA TTA AGA 2256
Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg
740 745 750 

GGG TAT ATC GAA GAT AGT CAA GAC TTA GAA ATC TAT TTA ATT CGC TAC 2304 
Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr
755 760 765

AAT GCA AAA CAT GAA ACA GTA AAT GTG CCA GGT ACG GGT TCC TTA TGG 2352 
Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

CCG CTT TCA GCC CAA AGT CCA ATC GGA AAG TGT GGA GAG CCG AAT CGA 2400 
Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg
785 790 795 800 

TGC GCG CCA CAC CTT GAA TGG AAT CCT GAC TTA GAT TGT TCG TGT AGG 2448
Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

GAT GGA GAA AAG TGT GCC CAT CAT TCG CAT CAT TTC TCC TTA GAC ATT 2496 
Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile
820 825 830 

GAT GTA GGA TGT ACA GAC TTA AAT GAG GAC CTA GGT GTA TGG GTG ATC 2544 
Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile
835 840 845

TTT AAG ATT AAG ACG CAA GAT GGG CAC GCA AGA CTA GGG AAT CTA GAG 2592 
Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu
850 855 860 

TTT CTC GAA GAG AAA CCA TTA GTA GGA GAA GCG CTA GCT CGT GTG AAA 2640 
Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

AGA GCG GAG AAA AAA TGG AGA GAC AAA CGT GAA AAA TTG GAA TGG GAA 2688
Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

ACA AAT ATC GTT TAT AAA GAG GCA AAA GAA TCT GTA GAT GCT TTA TTT 2736 
Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe
900 905 910 

GTA AAC TCT CAA TAT GAT CAA TTA CAA GCG GAT ACG AAT ATT GCC ATG 2784 
Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met
915 920 925

ATT CAT GCG GCA GAT AAA CGT GTT CAT AGC ATT CGA GAA GCT TAT CTG
Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu
930 935 940 

CCT GAG CTG TCT GTG ATT CCG GGT GTC AAT GCG GCT ATT TTT GAA GAA 
Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu
945 950 955 960 

TTA GAA GGG CGT ATT TTC ACT GCA TTC TCC CTA TAT GAT GCG AGA AAT 
Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn
965 970 975 

GTC ATT AAA AAT GGT GAT TTT AAT AAT GGC TTA TCC TGC TGG AAC GTG 
Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val
980 985 990 

AAA GGG CAT GTA GAT GTA GAA GAA CAA AAC AAC CAA CGT TCG GTC CTT 
Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu
995 1000 1005 

GTT GTT CCG GAA TGG GAA GCA GAA GTG TCA CAA GAA GTT CGT GTC TGT 
Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys
1010 1015 1020 

CCG GGT CGT GGC TAT ATC CTT CGT GTC ACA GCG TAC AAG GAG GGA TAT 
Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr
1025 1030 1035 1040 

GGA GAA GGT TGC GTA ACC ATT CAT GAG ATC GAG AAC AAT ACA GAC GAA 
Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu
1045 1050 1055 

CTG AAG TTT AGC AAC TGC GTA GAA GAG GAA ATC TAT CCA AAT AAC ACG 
Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr
1060 1065 1070 

GTA ACG TGT AAT GAT TAT ACT GTA AAT CAA GAA GAA TAC GGA GGT GCG 
Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala
1075 1080 1085 

TAC ACT TCT CGT AAT CGA GGA TAT AAC GAA GCT CCT TCC GTA CCA GCT 
Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

GAT TAT GCG TCA GTC TAT GAA GAA AAA TCG TAT ACA GAT GGA CGA AGA 
Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

GAG AAT CCT TGT GAA TTT AAC AGA GGG TAT AGG GAT TAC ACG CCA CTA 
Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135

CCA GTT GGT TAT GTG ACA AAA GAA TTA GAA TAC TTC CCA GAA ACC GAT
Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 



AAG GTA TGG ATT GAG ATT GGA GAA ACG GAA GGA ACA TTT ATC GTG GAC 
Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp
1155 1160 1165 

AGC GTG GAA TTA CTC CTT ATG GAG GAA TAG 
Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu
1 5 10 15

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly
20 25 30

Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser
35 40 45

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile
50 55 60

Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile
65 70 75 80

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala
85 90 95

Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu
100 105 110

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu
115 120 125

Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala
130 135 140

Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val
145 150 155 160

Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser
165 170 175

Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg
180 185 190

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val
195 200 205

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg
210 215 220

Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val
225 230 235 240

Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro
245 250 255

Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val
260 265 270

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu
275 280 285

Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr
290 295 300

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln
305 310 315 320

Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro
325 330 335

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala
340 345 350

Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg
355 360 365

Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp
370 375 380

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val
385 390 395 400

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln
405 410 415

Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His
420 425 430

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile
435 440 445

Arg Ala Pro Met Phe Ser Trp Ile His Arg Ser Ala Glu Phe Asn Asn
450 455 460

Ile Ile Ala Ser Asp Ser Ile Thr Gln Ile Pro Leu Val Lys Ala His
465 470 475 480

Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly
485 490 495

Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile
500 505 510



Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg
515 520 525

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu
530 535 540

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro
545 550 555 560

Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr
565 570 575

Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser
580 585 590

Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr
595 600 605

Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala Gln Lys Ala Val
610 615 620

Asn Ala Leu Phe Thr Ser Ile Asn Gln Ile Gly Ile Lys Thr Asp Val
625 630 635 640

Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser
645 650 655

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys
660 665 670

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn
675 680 685

Phe Lys Gly Ile Asn Arg Gln Leu Asp Arg Gly Trp Arg Gly Ser Thr
690 695 700

Asp Ile Thr Ile Gln Arg Gly Asp Asp Val Phe Lys Glu Asn Tyr Val
705 710 715 720

Thr Leu Pro Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln
725 730 735

Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg
740 745 750

Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr
755 760 765

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp
770 775 780

Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg
785 790 795 800



Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 



Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile
820 825 830 



Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile
835 840 845 

Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 



Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu
885 890 895

Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe
900 905 910

Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met
915 920 925

Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu
930 935 940

Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu
945 950 955 960



Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn
965 970 975

Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val
980 985 990

Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu
995 1000 1005

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys
1010 1015 1020

Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr
1025 1030 1035 1040

Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu
1045 1050 1055

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr
1060 1065 1070

Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala
1075 1080 1085

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala
1090 1095 1100

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg
1105 1110 1115 1120

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu
1125 1130 1135

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp
1140 1145 1150

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp
1155 1160 1165

Ser Val Glu Leu Leu Leu Met Glu Glu
1170 1175

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3579 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60

GTAGAAGTAT TAGGTGGAGA AAGAATAGAA ACTGGTTACA CCCCAATCGA TATTTCCTTG 120 

TCGCTAACGC AATTTCTTTT GAGTGAATTT GTTCCCGGTG CTGGATTTGT GTTAGGACTA 180 

GTTGATATAA TATGGGGAAT TTTTGGTCCC TCTCAATGGG ACGCATTTCT TGTACAAATT 240 

GAACAGTTAA TTAACCAAAG AATAGAAGAA TTCGCTAGGA ACCAAGCCAT TTCTAGATTA 300

GAAGGACTAA GCAATCTTTA TCAAATTTAC GCAGAATCTT TTAGAGAGTG GGAAGCAGAT 360 

CCTACTAATC CAGCATTAAG AGAAGAGATG CGTATTCAAT TCAATGACAT GAACAGTGCC 420

CTTACAACCG CTATTCCTCT TTTTGCAGTT CAAAATTATC AAGTTCCTCT TTTATCAGTA 480

TATGTTCAAG CTGCAAATTT ACATTTATCA GTTTTGAGAG ATGTTTCAGT GTTTGGACAA 540 

AGGTGGGGAT TTGATGCCGC GACTATCAAT AGTCGTTATA ATGATTTAAC TAGGCTTATT 600 

GGCAACTATA CAGATTATGC TGTACGCTGG TACAATACGG GATTAGAACG TGTATGGGGA 660 

CCGGATTCTA GAGATTGGGT AAGGTATAAT CAATTTAGAA GAGAATTAAC ACTAACTGTA 720

TTAGATATCG TTGCTCTGTT CCCGAATTAT GATAGTAGAA GATATCCAAT TCGAACAGTT 780 

TCCCAATTAA CAAGAGAAAT TTATACAAAC CCAGTATTAG AAAATTTTGA TGGTAGTTTT 840 

CGAGGCTCGG CTCAGGGCAT AGAAAGAAGT ATTAGGAGTC CACATTTGAT GGATATACTT 900 

AACAGTATAA CCATCTATAC GGATGCTCAT AGGGGTTATT ATTATTGGTC AGGGCATCAA 960 

ATAATGGCTT CTCCTGTAGG GTTTTCGGGG CCAGAATTCA CTTTTCCGCT ATATGGAACT 1020

ATGGGAAATG CAGCTCCACA ACAACGTATT GTTGCTCAAC TAGGTCAGGG CGTGTATAGA 1080 

ACATTATCGT CCACTTTATA TAGAAGACCT TTTAATATAG GGATAAATAA TCAACAACTA 1140 

TCTGTTCTTG ACGGGACAGA ATTTGCTTAT GGAACCTCCT CAAATTTGCC ATCCGCTGTA 1200 

TACAGAAAAA GCGGAACGGT AGATTCGCTG GATGAAATAC CGCCACAGAA TAACAACGTG 1260 

CCACCTAGGC AAGGATTTAG TCATCGATTA AGCCATGTTT CAATGTTTCG TTCAGGCTTT 1320

AGTAATAGTA GTGTAAGTAT AATAAGAGCT CCTATGTTCT CTTGGATACA TCGTAGTGCA 1380 

ACTCTTACAA ATACAATTGA TCCAGAGAGA ATTAATCAAA TACCTTTAGT GAAAGGATTT 1440 

AGAGTTTGGG GGGGCACCTC TGTCATTACA GGACCAGGAT TTACAGGAGG GGATATCCTT 1500 

CGAAGAAATA CCTTTGGTGA TTTTGTATCT CTACAAGTCA ATATTAATTC ACCAATTACC 1560 

CAAAGATACC GTTTAAGATT TCGTTACGCT TCCAGTAGGG ATGCACGAGT TATAGTATTA 1620

ACAGGAGCGG CATCCACAGG AGTGGGAGGC CAAGTTAGTG TAAATATGCC TCTTCAGAAA 1680 

ACTATGGAAA TAGGGGAGAA CTTAACATCT AGAACATTTA GATATACCGA TTTTAGTAAT 1740 

CCTTTTTCAT TTAGAGCTAA TCCAGATATA ATTGGGATAA GTGAACAACC TCTATTTGGT 1800 

GCAGGTTCTA TTAGTAGCGG TGAACTTTAT ATAGATAAAA TTGAAATTAT TCTAGCAGAT 1860 

GCAACATTTG AAGCAGAATC TGATTTAGAA AGAGCACAAA AGGCGGTGAA TGCCCTGTTT 1920

ACTTCTTCCA ATCAAATCGG GTTAAAAACC GATGTGACGG ATTATCATAT TGATCAAGTA 1980 

TCCAATTTAG TGGATTGTTT ATCAGATGAA TTTTGTCTGG ATGAAAAGCG AGAATTGTCC 2040

GAGAAAGTCA AACATGCGAA GCGACTCAGT GATGAGCGGA ATTTACTTCA AGATCCAAAC 2100

TTCAGAGGGA TCAATAGACA ACCAGACCGT GGCTGGAGAG GAAGTACAGA TATTACCATC 2160

CAAGGAGGAG ATGACGTATT CAAAGAGAAT TACGTCACAC TACCGGGTAC CGTTGATGAG 2220

TGCTATCCAA CGTATTTATA TCAGAAAATA GATGAGTCGA AATTAAAAGC TTATACCCGT 2280

TATGAATTAA GAGGGTATAT CGAAGATAGT CAAGACTTAG AAATCTATTT GATCCGTTAC 2340

AATGCAAAAC ACGAAATAGT AAATGTGCCA GGCACGGGTT CCTTATGGCC GCTTTCAGCC 2400

CAAAGTCCAA TCGGAAAGTG TGGAGAACCG AATCGATGCG CGCCACACCT TGAATGGAAT 2460

CCTGATCTAG ATTGTTCCTG CAGAGACGGG GAAAAATGTG CACATCATTC CCATCATTTC 2520

ACCTTGGATA TTGATGTTGG ATGTACAGAC TTAAATGAGG ACTTAGGTGT ATGGGTGATA 2580

TTCAAGATTA AGACGCAAGA TGGCCATGCA AGACTAGGGA ATCTAGAGTT TCTCGAAGAG 2640

AAACCATTAT TAGGGGAAGC ACTAGCTCGT GTGAAAAGAG CGGAGAAGAA GTGGAGAGAC 2700

AAACGAGAGA AACTGCAGTT GGAAACAAAT ATTGTTTATA AAGAGGCAAA AGAATCTGTA 2760

GATGCTTTAT TTGTAAACTC TCAATATGAT AGATTACAAG TGGATACGAA CATCGCAATG 2820

ATTCATGCGG CAGATAAACG CGTTCATAGA ATCCGGGAAG CGTATCTGCC AGAGTTGTCT 2880

GTGATTCCAG GTGTCAATGC GGCCATTTTC GAAGAATTAG AGGGACGTAT TTTTACAGCG 2940

TATTCCTTAT ATGATGCGAG AAATGTCATT AAAAATGGCG ATTTCAATAA TGGCTTATTA 3000

TGCTGGAACG TGAAAGGTCA TGTAGATGTA GAAGAGCAAA ACAACCACCG TTCGGTCCTT 3060

GTTATCCCAG AATGGGAGGC AGAAGTGTCA CAAGAGGTTC GTGTCTGTCC AGGTCGTGGC 3120

TATATCCTTC GTGTCACAGC ATATAAAGAG GGATATGGAG AGGGCTGCGT AACGATCCAT 3180

GAGATCGAAG ACAATACAGA CGAACTGAAA TTCAGCAACT GTGTAGAAGA GGAAGTATAT 3240

CCAAACAACA CAGTAACGTG TAATAATTAT ACTGGGACTC AAGAAGAATA TGAGGGTACG 3300

TACACTTCTC GTAATCAAGG ATATGACGAA GCCTATGGTA ATAACCCTTC CGTACCAGCT 3360

GATTACGCTT CAGTCTATGA AGAAAAATCG TATACAGATG GACGAAGAGA GAATCCTTGT 3420

GAATCTAACA GAGGCTATGG GGATTACACA CCACTACCGG CTGGTTATGT AACAAAGGAT 3480

TTAGAGTACT TCCCAGAGAC CGATAAGGTA TGGATTGAGA TCGGAGAAAC AGAAGGAACA 3540

TTCATCGTGG ATAGCGTGGA ATTACTCCTT ATGGAGGAA 3579



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1193 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu
1 5 10 15

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly
20 25 30

Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser
35 40 45

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile
50 55 60

Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile
65 70 75 80

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala
85 90 95

Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu
100 105 110

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu
115 120 125

Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala
130 135 140

Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val
145 150 155 160

Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser
165 170 175



Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg
180 185 190

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val
195 200 205

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg
210 215 220

Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val
225 230 235 240

Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro
245 250 255

Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val
260 265 270

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu
275 280 285

Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr
290 295 300

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln
305 310 315 320

Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro
325 330 335

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala
340 345 350

Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg
355 360 365

Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp
370 375 380

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val
385 390 395 400

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln
405 410 415

Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His
420 425 430

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile
435 440 445

Arg Ala Pro Met Phe Ser Trp Ile His Arg Ser Ala Thr Leu Thr Asn
450 455 460

Thr Ile Asp Pro Glu Arg Ile Asn Gln Ile Pro Leu Val Lys Gly Phe
465 470 475 480

Arg Val Trp Gly Gly Thr Ser Val Ile Thr Gly Pro Gly Phe Thr Gly
485 490 495

Gly Asp Ile Leu Arg Arg Asn Thr Phe Gly Asp Phe Val Ser Leu Gln
500 505 510

Val Asn Ile Asn Ser Pro Ile Thr Gln Arg Tyr Arg Leu Arg Phe Arg
515 520 525

Tyr Ala Ser Ser Arg Asp Ala Arg Val Ile Val Leu Thr Gly Ala Ala
530 535 540



Ser Thr Gly Val Gly Gly Gln Val Ser Val Asn Met Pro Leu Gln Lys
545 550 555 560

Thr Met Glu Ile Gly Glu Asn Leu Thr Ser Arg Thr Phe Arg Tyr Thr
565 570 575

Asp Phe Ser Asn Pro Phe Ser Phe Arg Ala Asn Pro Asp Ile Ile Gly
580 585 590

Ile Ser Glu Gln Pro Leu Phe Gly Ala Gly Ser Ile Ser Ser Gly Glu
595 600 605

Leu Tyr Ile Asp Lys Ile Glu Ile Ile Leu Ala Asp Ala Thr Phe Glu
610 615 620

Ala Glu Ser Asp Leu Glu Arg Ala Gln Lys Ala Val Asn Ala Leu Phe
625 630 635 640

Thr Ser Ser Asn Gln Ile Gly Leu Lys Thr Asp Val Thr Asp Tyr His
645 650 655

Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser Asp Glu Phe Cys
660 665 670

Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys His Ala Lys Arg
675 680 685

Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn Phe Arg Gly Ile
690 695 700

Asn Arg Gln Pro Asp Arg Gly Trp Arg Gly Ser Thr Asp Ile Thr Ile
705 710 715 720

Gln Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val Thr Leu Pro Gly
725 730 735

Thr Val Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln Lys Ile Asp Glu
740 745 750

Ser Lys Leu Lys Ala Tyr Thr Arg Tyr Glu Leu Arg Gly Tyr Ile Glu
755 760 765

Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr Asn Ala Lys His
770 775 780

Glu Ile Val Asn Val Pro Gly Thr Gly Ser Leu Trp Pro Leu Ser Ala
785 790 795 800

Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg Cys Ala Pro His
805 810 815

Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg Asp Gly Glu Lys
820 825 830

Cys Ala His His Ser His His Phe Thr Leu Asp Ile Asp Val Gly Cys
835 840 845

Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile Phe Lys Ile Lys
850 855 860

Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu Phe Leu Glu Glu
865 870 875 880

Lys Pro Leu Leu Gly Glu Ala Leu Ala Arg Val Lys Arg Ala Glu Lys
885 890 895

Lys Trp Arg Asp Lys Arg Glu Lys Leu Gln Leu Glu Thr Asn Ile Val
900 905 910

Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe Val Asn Ser Gln
915 920 925

Tyr Asp Arg Leu Gln Val Asp Thr Asn Ile Ala Met Ile His Ala Ala
930 935 940

Asp Lys Arg Val His Arg Ile Arg Glu Ala Tyr Leu Pro Glu Leu Ser
945 950 955 960

Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu Leu Glu Gly Arg
965 970 975



Ile Phe Thr Ala Tyr Ser Leu Tyr Asp Ala Arg Asn Val Ile Lys Asn
980 985 990

Gly Asp Phe Asn Asn Gly Leu Leu Cys Trp Asn Val Lys Gly His Val
995 1000 1005

Asp Val Glu Glu Gln Asn Asn His Arg Ser Val Leu Val Ile Pro Glu
1010 1015 1020

Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys Pro Gly Arg Gly
1025 1030 1035 1040

Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr Gly Glu Gly Cys
1045 1050 1055

Val Thr Ile His Glu Ile Glu Asp Asn Thr Asp Glu Leu Lys Phe Ser
1060 1065 1070

Asn Cys Val Glu Glu Glu Val Tyr Pro Asn Asn Thr Val Thr Cys Asn
1075 1080 1085

Asn Tyr Thr Gly Thr Gln Glu Glu Tyr Glu Gly Thr Tyr Thr Ser Arg
1090 1095 1100

Asn Gln Gly Tyr Asp Glu Ala Tyr Gly Asn Asn Pro Ser Val Pro Ala
1105 1110 1115 1120

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg
1125 1130 1135

Glu Asn Pro Cys Glu Ser Asn Arg Gly Tyr Gly Asp Tyr Thr Pro Leu
1140 1145 1150

Pro Ala Gly Tyr Val Thr Lys Asp Leu Glu Tyr Phe Pro Glu Thr Asp
1155 1160 1165

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp
1170 1175 1180

Ser Val Glu Leu Leu Leu Met Glu Glu
1185 1190



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CGTTGCTCTG TTCCCG 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TCAAATACCA TTGGTAAAAG 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3534 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60

GTAGAAGTAT TAGGTGGAGA AAGAATAGAA ACTGGTTACA CCCCAATCGA TATTTCCTTG 120 

TCGCTAACGC AATTTCTTTT GAGTGAATTT GTTCCCGGTG CTGGATTTGT GTTAGGACTA 180 

GTTGATATAA TATGGGGAAT TTTTGGTCCC TCTCAATGGG ACGCATTTCT TGTACAAATT 240 

GAACAGTTAA TTAACCAAAG AATAGAAGAA TTCGCTAGGA ACCAAGCCAT TTCTAGATTA 300 

GAAGGACTAA GCAATCTTTA TCAAATTTAC GCAGAATCTT TTAGAGAGTG GGAAGCAGAT 360 

CCTACTAATC CAGCATTAAG AGAAGAGATG CGTATTCAAT TCAATGACAT GAACAGTGCC 420

CTTACAACCG CTATTCCTCT TTTTGCAGTT CAAAATTATC AAGTTCCTCT TTTATCAGTA 480 

TATGTTCAAG CTGCAAATTT ACATTTATCA GTTTTGAGAG ATGTTTCAGT GTTTGGACAA 540

AGGTGGGGAT TTGATGCCGC GACTATCAAT AGTCGTTATA ATGATTTAAC TAGGCTTATT 600 

GGCAACTATA CAGATTATGC TGTACGCTGG TACAATACGG GATTAGAACG TGTATGGGGA 660 

CCGGATTCTA GAGATTGGGT AAGGTATAAT CAATTTAGAA GAGAATTAAC ACTAACTGTA 720

TTAGATATCG TTGCTCTGTT CCCGAATTAT GATAGTAGAA GATATCCAAT TCGAACAGTT 780 

TCCCAATTAA CAAGAGAAAT TTATACAAAC CCAGTATTAG AAAATTTTGA TGGTAGTTTT 840

CGAGGCTCGG CTCAGGGCAT AGAAAGAAGT ATTAGGAGTC CACATTTGAT GGATATACTT 900 

AACAGTATAA CCATCTATAC GGATGCTCAT AGGGGTTATT ATTATTGGTC AGGGCATCAA 960 

ATAATGGCTT CTCCTGTAGG GTTTTCGGGG CCAGAATTCA CTTTTCCGCT ATATGGAACT 1020

ATGGGAAATG CAGCTCCACA ACAACGTATT GTTGCTCAAC TAGGTCAGGG CGTGTATAGA 1080 

ACATTATCGT CCACTTTATA TAGAAGACCT TTTAATATAG GGATAAATAA TCAACAACTA 1140

TCTGTTCTTG ACGGGACAGA ATTTGCTTAT GGAACCTCCT CAAATTTGCC ATCCGCTGTA 1200 

TACAGAAAAA GCGGAACGGT AGATTCGCTG GATGAAATAC CGCCACAGAA TAACAACGTG 1260

CCACCTAGGC AAGGATTTAG TCATCGATTA AGCCATGTTT CAATGTTTCG TTCAGGCTTT 1320

AGTAATAGTA GTGTAAGTAT AATAAGAGCT CCTATGTTCT CTTGGATACA TCGTAGTGCT 1380

GAATTTAATA ATATAATTGC ATCGGATAGT ATTACTCAAA TACCATTGGT AAAAGCACAT 1440 

ACACTTCAGT CAGGTACTAC TGTTGTAAGA GGGCCCGGGT TTACGGGAGG AGATATTCTT 1500 

CGACGAACAA GTGGAGGACC ATTTGCTTAT ACTATTGTTA ATATAAATGG GCAATTACCC 1560 



CAAAGGTATC GTGCAAGAAT ACGCTATGCC TCTACTACAA ATCTAAGAAT TTACGTAACG 1620

GTTGCAGGTG AACGGATTTT TGCTGGTCAA TTTAACAAAA CAATGGATAC CGGTGACCCA 1680

TTAACATTCC AATCTTTTAG TTACGCAACT ATTAATACAG CTTTTACATT CCCAATGAGC 1740

CAGAGTAGTT TCACAGTAGG TGCTGATACT TTTAGTTCAG GGAATGAAGT TTATATAGAC 1800 

AGATTTGAAT TGATTCCAGT TACTGCAACA CTCGAGGCTG AATATAATCT GGAAAGAGCG 1860

CAGAAGGCGG TGAATGCGCT GTTTACGTCT ACAAACCAAC TAGGGCTAAA AACAAATGTA 1920 

ACGGATTATC ATATTGATCA AGTGTCCAAT TTAGTTACGT ATTTATCGGA TGAATTTTGT 1980 

CTGGATGAAA AGCGAGAATT GTCCGAGAAA GTCAAACATG CGAAGCGACT CAGTGATGAA 2040

CGCAATTTAC TCCAAGATTC AAATTTCAAA GACATTAATA GGCAACCAGA ACGTGGGTGG 2100 

GGCGGAAGTA CAGGGATTAC CATCCAAGGA GGGGATGACG TATTTAAAGA AAATTACGTC 2160 

ACACTATCAG GTACCTTTGA TGAGTGCTAT CCAACATATT TGTATCAAAA AATCGATGAA 2220 

TCAAAATTAA AAGCCTTTAC CCGTTATCAA TTAAGAGGGT ATATCGAAGA TAGTCAAGAC 2280

TTAGAAATCT ATTTAATTCG CTACAATGCA AAACATGAAA CAGTAAATGT GCCAGGTACG 2340

GGTTCCTTAT GGCCGCTTTC AGCCCAAAGT CCAATCGGAA AGTGTGGAGA GCCGAATCGA 2400 

TGCGCGCCAC ACCTTGAATG GAATCCTGAC TTAGATTGTT CGTGTAGGGA TGGAGAAAAG 2460 

TGTGCCCATC ATTCGCATCA TTTCTCCTTA GACATTGATG TAGGATGTAC AGACTTAAAT 2520 

GAGGACCTAG GTGTATGGGT GATCTTTAAG ATTAAGACGC AAGATGGGCA CGCAAGACTA 2580 

GGGAATCTAG AGTTTCTCGA AGAGAAACCA TTAGTAGGAG AAGCGCTAGC TCGTGTGAAA 2640

AGAGCGGAGA AAAAATGGAG AGACAAACGT GAAAAATTGG AATGGGAAAC AAATATCGTT 2700 

TATAAAGAGG CAAAAGAATC TGTAGATGCT TTATTTGTAA ACTCTCAATA TGATCAATTA 2760

CAAGCGGATA GGAATATTGC CATGATTCAT GCGGCAGATA AACGTGTTCA TAGCATTCGA 2820

GAAGCTTATC TGCCTGAGCT GTCTGTGATT CCGGGTGTCA ATGCGGCTAT TTTTGAAGAA 2880 

TTAGAAGGGC GTATTTTCAC TGCATTCTCC CTATATGATG CGAGAAATGT CATTAAAAAT 2940

GGTGATTTTA ATAATGGCTT ATCCTGCTGG AACGTGAAAG GGCATGTAGA TGTAGAAGAA 3000 

CAAAACAACC AACGTTCGGT CCTTGTTGTT CCGGAATGGG AAGCAGAAGT GTCACAAGAA 3060 

GTTCGTGTCT GTCCGGGTCG TGGCTATATC CTTCGTGTCA CAGCGTACAA GGAGGGATAT 3120

GGAGAAGGTT GCGTAACCAT TCATGAGATC GAGAACAATA CAGACGAACT GAAGTTTAGC 3180

AACTGCGTAG AAGAGGAAAT CTATCCAAAT AACACGGTAA CGTGTAATGA TTATACTGTA 3240

AATCAAGAAG AATACGGAGG TGCGTACACT TCTCGTAATC GAGGATATAA CGAAGCTCCT 3300 

TCCGTACCAG CTGATTATGC GTCAGTCTAT GAAGAAAAAT CGTATACAGA TGGACGAAGA 3360 

GAGAATCCTT GTGAATTTAA CAGAGGGTAT AGGGATTACA CGCCACTACC AGTTGGTTAT 3420

GTGACAAAAG AATTAGAATA CTTCCCAGAA ACCGATAAGG TATGGATTGA GATTGGAGAA 3480 

ACGGAAGGAA CATTTATCGT GGACAGCGTG GAATTACTCC TTATGGAGGA ATAG 3534 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Met Asp Asn Asn Pro Asn Ile Asn Glu Cys Ile Pro Tyr Asn Cys Leu
1 5 10 15

Ser Asn Pro Glu Val Glu Val Leu Gly Gly Glu Arg Ile Glu Thr Gly
20 25 30

Tyr Thr Pro Ile Asp Ile Ser Leu Ser Leu Thr Gln Phe Leu Leu Ser
35 40 45

Glu Phe Val Pro Gly Ala Gly Phe Val Leu Gly Leu Val Asp Ile Ile
50 55 60

Trp Gly Ile Phe Gly Pro Ser Gln Trp Asp Ala Phe Leu Val Gln Ile
65 70 75 80

Glu Gln Leu Ile Asn Gln Arg Ile Glu Glu Phe Ala Arg Asn Gln Ala
85 90 95

Ile Ser Arg Leu Glu Gly Leu Ser Asn Leu Tyr Gln Ile Tyr Ala Glu
100 105 110

Ser Phe Arg Glu Trp Glu Ala Asp Pro Thr Asn Pro Ala Leu Arg Glu
115 120 125

Glu Met Arg Ile Gln Phe Asn Asp Met Asn Ser Ala Leu Thr Thr Ala
130 135 140

Ile Pro Leu Phe Ala Val Gln Asn Tyr Gln Val Pro Leu Leu Ser Val
145 150 155 160



Tyr Val Gln Ala Ala Asn Leu His Leu Ser Val Leu Arg Asp Val Ser
165 170 175

Val Phe Gly Gln Arg Trp Gly Phe Asp Ala Ala Thr Ile Asn Ser Arg
180 185 190

Tyr Asn Asp Leu Thr Arg Leu Ile Gly Asn Tyr Thr Asp Tyr Ala Val
195 200 205

Arg Trp Tyr Asn Thr Gly Leu Glu Arg Val Trp Gly Pro Asp Ser Arg
210 215 220

Asp Trp Val Arg Tyr Asn Gln Phe Arg Arg Glu Leu Thr Leu Thr Val
225 230 235 240

Leu Asp Ile Val Ala Leu Phe Pro Asn Tyr Asp Ser Arg Arg Tyr Pro
245 250 255

Ile Arg Thr Val Ser Gln Leu Thr Arg Glu Ile Tyr Thr Asn Pro Val
260 265 270

Leu Glu Asn Phe Asp Gly Ser Phe Arg Gly Ser Ala Gln Gly Ile Glu
275 280 285

Arg Ser Ile Arg Ser Pro His Leu Met Asp Ile Leu Asn Ser Ile Thr
290 295 300

Ile Tyr Thr Asp Ala His Arg Gly Tyr Tyr Tyr Trp Ser Gly His Gln
305 310 315 320

Ile Met Ala Ser Pro Val Gly Phe Ser Gly Pro Glu Phe Thr Phe Pro
325 330 335

Leu Tyr Gly Thr Met Gly Asn Ala Ala Pro Gln Gln Arg Ile Val Ala
340 345 350

Gln Leu Gly Gln Gly Val Tyr Arg Thr Leu Ser Ser Thr Leu Tyr Arg
355 360 365

Arg Pro Phe Asn Ile Gly Ile Asn Asn Gln Gln Leu Ser Val Leu Asp
370 375 380

Gly Thr Glu Phe Ala Tyr Gly Thr Ser Ser Asn Leu Pro Ser Ala Val
385 390 395 400

Tyr Arg Lys Ser Gly Thr Val Asp Ser Leu Asp Glu Ile Pro Pro Gln
405 410 415

Asn Asn Asn Val Pro Pro Arg Gln Gly Phe Ser His Arg Leu Ser His
420 425 430

Val Ser Met Phe Arg Ser Gly Phe Ser Asn Ser Ser Val Ser Ile Ile
435 440 445

Arg Ala Pro Met Phe Ser Trp Ile His Arg Ser Ala Glu Phe Asn Asn 
450 455 460 

Ile Ile Ala Ser Asp Ser Ile Thr Gln Ile Pro Leu Val Lys Ala His 
465 470 475 480 

Thr Leu Gln Ser Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly 
485 490 495 

Gly Asp Ile Leu Arg Arg Thr Ser Gly Gly Pro Phe Ala Tyr Thr Ile 
500 505 510 

Val Asn Ile Asn Gly Gln Leu Pro Gln Arg Tyr Arg Ala Arg Ile Arg 
515 520 525 

Tyr Ala Ser Thr Thr Asn Leu Arg Ile Tyr Val Thr Val Ala Gly Glu 
530 535 540 

Arg Ile Phe Ala Gly Gln Phe Asn Lys Thr Met Asp Thr Gly Asp Pro 
545 550 555 560 

Leu Thr Phe Gln Ser Phe Ser Tyr Ala Thr Ile Asn Thr Ala Phe Thr 
565 570 575 

Phe Pro Met Ser Gln Ser Ser Phe Thr Val Gly Ala Asp Thr Phe Ser 
580 585 590 

Ser Gly Asn Glu Val Tyr Ile Asp Arg Phe Glu Leu Ile Pro Val Thr 
595 600 605 

Ala Thr Leu Glu Ala Glu Tyr Asn Leu Glu Arg Ala Gln Lys Ala Val 
610 615 620 

Asn Ala Leu Phe Thr Ser Thr Asn Gln Leu Gly Leu Lys Thr Asn Val 
625 630 635 640 

Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val Thr Tyr Leu Ser 
645 650 655 

Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys 
660 665 670 

His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Ser Asn 
675 680 685 

Phe Lys Asp Ile Asn Arg Gln Pro Glu Arg Gly Trp Gly Gly Ser Thr 
690 695 700 

Gly Ile Thr Ile Gln Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val 
705 710 715 720 

Thr Leu Ser Gly Thr Phe Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln 
725 730 735 

Lys Ile Asp Glu Ser Lys Leu Lys Ala Phe Thr Arg Tyr Gln Leu Arg 
740 745 750 

Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr 
755 760 765 

Asn Ala Lys His Glu Thr Val Asn Val Pro Gly Thr Gly Ser Leu Trp 
770 775 780 

Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg 
785 790 795 800 

Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg 
805 810 815 

Asp Gly Glu Lys Cys Ala His His Ser His His Phe Ser Leu Asp Ile 
820 825 830 

Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile 
835 840 845 

Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu 
850 855 860 

Phe Leu Glu Glu Lys Pro Leu Val Gly Glu Ala Leu Ala Arg Val Lys 
865 870 875 880 

Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys Leu Glu Trp Glu 
885 890 895 

Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe 
900 905 910 

Val Asn Ser Gln Tyr Asp Gln Leu Gln Ala Asp Thr Asn Ile Ala Met 
915 920 925 

Ile His Ala Ala Asp Lys Arg Val His Ser Ile Arg Glu Ala Tyr Leu 
930 935 940 

Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu 
945 950 955 960 

Leu Glu Gly Arg Ile Phe Thr Ala Phe Ser Leu Tyr Asp Ala Arg Asn 
965 970 975 

Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Ser Cys Trp Asn Val 
980 985 990 

Lys Gly His Val Asp Val Glu Glu Gln Asn Asn Gln Arg Ser Val Leu 
995 1000 1005 

Val Val Pro Glu Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys 
1010 1015 1020 

Pro Gly Arg Gly Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr 
1025 1030 1035 1040 

Gly Glu Gly Cys Val Thr Ile His Glu Ile Glu Asn Asn Thr Asp Glu 
1045 1050 1055 

Leu Lys Phe Ser Asn Cys Val Glu Glu Glu Ile Tyr Pro Asn Asn Thr 
1060 1065 1070 

Val Thr Cys Asn Asp Tyr Thr Val Asn Gln Glu Glu Tyr Gly Gly Ala 
1075 1080 1085 

Tyr Thr Ser Arg Asn Arg Gly Tyr Asn Glu Ala Pro Ser Val Pro Ala 
1090 1095 1100 

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg 
1105 1110 1115 1120 

Glu Asn Pro Cys Glu Phe Asn Arg Gly Tyr Arg Asp Tyr Thr Pro Leu 
1125 1130 1135 

Pro Val Gly Tyr Val Thr Lys Glu Leu Glu Tyr Phe Pro Glu Thr Asp 
1140 1145 1150 

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp 
1155 1160 1165 

Ser Val Glu Leu Leu Leu Met Glu Glu 
1170 1175 



(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
TGCAACACTC GAGGCTGAAT 

All of the compositions and methods disclosed and claimed herein can be made 
and executed without undue experimentation in light of the present disclosure. While the 
compositions and methods of this invention have been described in terms of preferred 
embodiments, it will be apparent to those of skill in the art that variations may be applied 
to the compositions and methods and in the steps or in the sequence of steps of the 
method described herein without departing from the concept, spirit and scope of the 
invention. More specifically, it will be apparent that certain agents which are both 
chemically and physiologically related may be substituted for the agents described herein 
while the same or similar results would be achieved. All such similar substitutes and 
modifications apparent to those skilled in the art are deemed to be within the spirit, scope 
and concept of the invention as defined by the appended claims. 